This research project is based on the umbrella project “Pandemic Emergency in Social Perspective. Evidence from a large Web-survey research”, designed and organized by principal investigators Linda Lombi (Università Cattolica del Sacro Cuore, Milan) and Marco Terraneo (Università Bicocca-Milano).
The principal goal of the international cross-sectional study is to explore the predictors of depression within the European context of the Covid-19 pandemic, specifically during the lockdown and social distancing period of March-April 2020.
Our team decided to focus primarily on the impact of modifiable behavioral and lifestyle factors, such as exercise and alcohol and tobacco consumption, as well as the use of social media as a source of information about the pandemic. Our intention is to build and validate a model of depression based on these literature-derived predictors. Furthermore, we intend to explore the indirect pathway between social media consumption and depression, mediated by the level of Covid-19-related concern/anxiety.
Supplementary materials for this project, such as the survey questionnaire, the original dataset and other key documents, are accessible in our Open Science Framework repository. The R Markdown code is also accessible in our GitHub repository.
Given the rapidly developing nature of the Covid-19 pandemic, the principal research team (Lombi & Terraneo) chose a convenience sample, recruited through national Facebook groups using a snowballing technique. The goal was to collect at least 1000 responses per country.
Data collection took place in March and April 2020 in the following eight countries: Italy, France, Germany, Spain, United Kingdom, Sweden, Poland and the Czech Republic. It was conducted by the members of the respective national teams (please see the research protocol in the OSF repository).
Because the sampling is non-random, it is likely to yield samples that are not representative of the national populations. This is one of the limitations of this research and is reflected in the “data collection and sampling” part of the research protocol outlined by Linda Lombi and Marco Terraneo.
This approach therefore does not aim to compare country samples but rather to compare segments of the national samples, with a particular focus on vulnerable social groups, defined by socio-demographic, lifestyle, professional and living-condition characteristics.
In order to comply with the principles of Open Science, we intend to split our analysis into two parts.
COV19_05_agroup.sav) inductively and consider the formulation of additional hypotheses for other predictors that might have been missed before the beginning of the study. To lower the chance of overfitting, we only consider adding variables that have empirical support in our review of the existing literature. Towards the end of the first part of the project, we pre-register our hypotheses and other key research information (including this reproducible R code) at the OSF Registries. While some of the team members have briefly interacted with the international dataset, they have not been involved in the pre-registration and hypothesis-forming process, in order to reduce bias by separating the exploratory and confirmatory phases of the research.

| Alternative Hypotheses | Variable | Literature |
|---|---|---|
| H1: Female gender is associated with higher levels of depression. | q01 | (Salk, Hyde, and Abramson 2017; Kowal et al. 2020; Wang, Pan, Wan, Tan, Xu, McIntyre, et al. 2020; Luo et al. 2020; González-Sanguino et al. 2020) |
| H2: Higher age is associated with lower levels of depression. | q02 | (Kowal et al. 2020; Shevlin et al. 2020; Taylor et al. 2008; Losada-Baltar et al. 2020; González-Sanguino et al. 2020; Carstensen 2006) |
| H3: People in a relationship experience lower levels of depression. | q03 | (Kowal et al. 2020; Jacob, Haro, and Koyanagi 2019) |
| H4: Parenthood is associated with significantly different levels of depression. | q04 | (Stanca 2012; Shevlin et al. 2020) |
| H5: Higher education is associated with lower levels of depression. | q11 | (Kowal et al. 2020; Gloster et al. 2020; Taylor et al. 2008) |
| H6: Use of social media is associated with higher levels of depression. | q18_02 | (Bendau et al. 2020; Dhir et al. 2018; Primack et al. 2017) |
| H7: Physical contact with friends and family is associated with lower levels of depression. | q35_01, q35_03 | (Gloster et al. 2020; Tull et al. 2020; Luo et al. 2020) |
| H8: Regular consumption of alcohol and tobacco is associated with higher levels of depression. | q38, q40 | (Stanton et al. 2020; Awaworyi Churchill and Farrell 2017) |
| H9: Regular workouts or physical activity are associated with lower levels of depression. | q42 | (Harvey et al. 2018; Schuch et al. 2016; Kvam et al. 2016; Krogh et al. 2017; Stubbs et al. 2018) |
| H10: Worse self-rated health quality is associated with higher levels of depression. | q47, q48, q49 | (Ambresin et al. 2014; Vindegaard and Benros 2020; Hossain et al. 2020) |
| H11: Adequate level of public information about Covid-19 transmission and precautionary measures to prevent its spread (hand washing and mask wearing) is associated with lower levels of depression. | q20, q34_02, q34_07 | (Wang, Pan, Wan, Tan, Xu, Ho, et al. 2020; Wang, Pan, Wan, Tan, Xu, McIntyre, et al. 2020) |
| H12: Economic distress is associated with higher levels of depression. | q36 | (Meltzer et al. 2009) |
| H13: In addition to H6, we hypothesize the existence of a causal pathway leading from social media exposure to depression, which is mediated by Covid-19 concern/anxiety and moderated by age and gender. | q01, q02, q18_02, concern_index | (Bendau et al. 2020; Rasmussen et al. 2020; Wheaton, Prikhidko, and Messner 2021; Vannucci, Flannery, and Ohannessian 2017; Mertens et al. 2020) |
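
The mediated pathway in H13 can be illustrated with a product-of-coefficients sketch. This is a minimal illustration on simulated data, not the pre-registered analysis: the variable names `soc_media`, `concern` and `phq8_sim` and all effect sizes are hypothetical, and the moderation by age and gender is left out for brevity.

```r
# Simulated, hypothetical data: none of these values come from the survey.
set.seed(2020)
n <- 500
soc_media <- rbinom(n, size = 1, prob = 0.4)             # exposure (0 = no, 1 = yes)
concern   <- 0.5 * soc_media + rnorm(n)                  # mediator (concern index)
phq8_sim  <- 0.4 * concern + 0.1 * soc_media + rnorm(n)  # outcome

# Path a: exposure -> mediator
fit_a <- lm(concern ~ soc_media)
# Path b (mediator -> outcome) and the direct effect of the exposure
fit_b <- lm(phq8_sim ~ concern + soc_media)

a        <- unname(coef(fit_a)["soc_media"])
b        <- unname(coef(fit_b)["concern"])
direct   <- unname(coef(fit_b)["soc_media"])
indirect <- a * b                                        # mediated (indirect) effect

# Total effect from a model that omits the mediator
total <- unname(coef(lm(phq8_sim ~ soc_media))["soc_media"])

round(c(a = a, b = b, indirect = indirect, direct = direct, total = total), 3)
```

In ordinary least squares the total effect equals the sum of the direct and indirect effects, which makes the decomposition easy to verify. A full analysis would also need a confidence interval for the indirect effect, for example from a bootstrap.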
The code below can be run in R or in an R IDE, such as RStudio. We used R Markdown within RStudio to compose this report and used the open-source jamovi software (an R GUI) to conduct some of the exploratory analyses, which are then replicated here.
# The following packages might need to be installed in your version
# of R before running the code below.
# Package names (kableExtra is called below via kableExtra:: and must be installed)
packages <- c("udpipe", "MASS", "lavaan", "processR", "wordcloud", "corrplot", "tidytext", "tidyverse", "haven", "jmv", "Hmisc", "kableExtra")
# Install packages not yet installed
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {
install.packages(packages[!installed_packages])
}
# Packages loading
invisible(lapply(packages, library, character.only = TRUE))
# We load the original Czech dataset (in SPSS format) from a local directory.
data <- zap_labels(haven::read_sav(file = "COV19_05_agroup.sav"))
# For use in correlation analysis, we duplicate the dataset under name data_corr
data_corr <- data
# We also try to limit the decimals to three significant figures
options(digits = 3, scipen = 999)
# Firstly, because the source file is an SPSS file, we need to specify that we
# would like to see value labels (such as Male/Female) for selected variables, as
# opposed to just numeric values (such as 1/2). This is not essential for
# the analysis, but seeing the names of labels will enable better understanding
# of the results. We also rename key variables to a more human-readable form,
# while also renaming variables related to Covid-19 concerns, which we will use
# to construct the Covid-19 concern index with factor analysis (to use for
# path analysis afterwards). Finally, for convenience, we translate the core
# variables labels from Czech to English.
data <- data %>%
transmute(id = RespondentID,
q01_gender = recode_factor(as_factor(q01),
`1` = "female",
`2` = "male"),
q02_age = q02,
q02_age_group = recode_factor(as_factor(Q4_AGE_r),
`1` = "16-29 years",
`2` = "30-49 years",
`3` = "50-64 years",
`4` = "65+"),
q03_relationship_type = recode_factor(as_factor(q03),
`1` = "single",
`2` = "relationship",
`3` = "married",
`4` = "divorced",
`5` = "widowed"),
q04_children = recode_factor(as_factor(q04),
`1` = "yes",
`2` = "no"),
q11_education = recode_factor(as_factor(q11),
`1` = "unfin_element",
`2` = "element",
`3` = "unfin_hs",
`4` = "hs",
`5` = "undergrad",
`6` = "postgrad"),
q18_02_soc_media = recode_factor(as_factor(replace_na(q18_02, 0)),
`0` = "no",
`1` = "yes"),
q20_public_info = recode_factor(as_factor(q20),
`1` = "yes",
`2` = "no",
`3` = "do_not_know"),
q34_02_face_mask = recode_factor(as_factor(q34_02),
`1` = "yes",
`2` = "no"),
q34_07_hand_washing = recode_factor(as_factor(q34_07),
`1` = "yes",
`2` = "no"),
q35_01_contact_close_family = recode_factor(as_factor(q35_01),
`1` = "less_often",
`2` = "as_before",
`3` = "more_often"),
q35_03_contact_friends = recode_factor(as_factor(q35_03),
`1` = "less_often",
`2` = "as_before",
`3` = "more_often"),
q36_econ_worry = recode_factor(as_factor(q36),
`1` = "very_serious",
`2` = "serious",
`3` = "limited"),
q38_alcohol = recode_factor(as_factor(q38),
`1` = "yes",
`2` = "no"),
q40_smoking = recode_factor(as_factor(q40),
`1` = "yes",
`2` = "no"),
q42_sport = recode_factor(as_factor(q42),
`1` = "yes",
`2` = "no"),
q47_self_reporting_health = recode_factor(as_factor(q47),
`1` = "excellent",
`2` = "good",
`3` = "neutral",
`4` = "bad",
`5` = "very_bad"),
q48_chronic_illness = recode_factor(as_factor(q48),
`1` = "yes",
`2` = "no"),
q49_health_limitations = recode_factor(as_factor(q49),
`1` = "limits",
`2` = "partially_limits",
`3` = "no_limits"),
q30_concern_infection_covid = q30,
q31_concern_infection_friends = q31,
q33_01_concern_situation = q33_01,
q33_02_concern_low_control = q33_02,
q33_03_concern_survival_covid = q33_03,
q33_04_concern_change_employment = q33_04,
q33_05_concern_infecting_others = q33_05,
PHQ8 = PHQ8,
q50_comment = q50)
kableExtra::kbl(head(data),
caption = "The overview of the structure of the dataset and its key variables") %>%
kableExtra::kable_classic(lightable_options = c("striped")) %>%
kableExtra::scroll_box(width = "830px", height = "100%")
| id | q01_gender | q02_age | q02_age_group | q03_relationship_type | q04_children | q11_education | q18_02_soc_media | q20_public_info | q34_02_face_mask | q34_07_hand_washing | q35_01_contact_close_family | q35_03_contact_friends | q36_econ_worry | q38_alcohol | q40_smoking | q42_sport | q47_self_reporting_health | q48_chronic_illness | q49_health_limitations | q30_concern_infection_covid | q31_concern_infection_friends | q33_01_concern_situation | q33_02_concern_low_control | q33_03_concern_survival_covid | q33_04_concern_change_employment | q33_05_concern_infecting_others | PHQ8 | q50_comment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1115 | female | 69 | 65+ | widowed | no | hs | no | yes | yes | yes | less_often | less_often | limited | no | no | no | NA | NA | NA | 1 | 1 | 2 | 2 | 1 | 1 | 1 | 3 | |
| 349 | female | 37 | 30-49 years | single | no | undergrad | no | yes | yes | yes | as_before | less_often | serious | no | no | no | NA | NA | NA | 1 | 1 | 1 | 5 | 1 | 1 | 1 | 7 | |
| 1907 | female | 23 | 16-29 years | single | no | undergrad | no | yes | yes | yes | less_often | less_often | very_serious | yes | no | yes | NA | NA | NA | 5 | 7 | 8 | 3 | 3 | 1 | 10 | 12 | |
| 1083 | female | 20 | 16-29 years | single | no | hs | yes | yes | yes | yes | more_often | as_before | limited | yes | no | yes | NA | NA | NA | 4 | 6 | 7 | 7 | 6 | 6 | 5 | 13 | |
| 911 | female | 72 | 65+ | widowed | yes | hs | no | yes | yes | yes | less_often | less_often | serious | no | no | no | NA | NA | NA | 10 | 10 | 9 | 9 | 9 | 1 | 9 | 15 | |
| 1379 | female | 19 | 16-29 years | relationship | no | element | yes | no | yes | yes | less_often | more_often | serious | yes | yes | yes | NA | NA | NA | 3 | 8 | 10 | 5 | 2 | 3 | 10 | 8 | |
The dependent variable PHQ8 is intended to determine the presence and severity of major depressive disorder. The PHQ-8 index construction is standardized and based on the established methodology (Kroenke et al. 2009). The PHQ-8 questionnaire asks on how many days in the past 2 weeks the respondent experienced each specific depressive symptom.
This variable was computed by the international team from 8 survey items (see the OSF project page for the precise syntax) and is thus already present in this version of the dataset.
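For readers unfamiliar with the index, the scoring logic can be sketched as follows. This is only an illustration: the item names `phq_1` to `phq_8` and the three respondents are hypothetical, and the exact syntax used by the international team is the one on the OSF project page.

```r
# Hypothetical item-level responses for three respondents (item names assumed).
# Each item is scored 0-3: 0 = not at all, 1 = several days,
# 2 = more than half the days, 3 = nearly every day.
items <- data.frame(
  phq_1 = c(0, 1, 3), phq_2 = c(0, 2, 3), phq_3 = c(1, 1, 2), phq_4 = c(0, 2, 3),
  phq_5 = c(0, 1, 2), phq_6 = c(0, 1, 3), phq_7 = c(1, 0, 2), phq_8 = c(0, 1, 2)
)

# The total score is the sum of the eight items (possible range 0-24).
phq8_total <- rowSums(items)

# A total of 10 or more is the cut-off proposed by Kroenke et al. (2009).
phq8_case <- phq8_total >= 10

data.frame(phq8_total, phq8_case)
```

The 0-24 range of the summed score is consistent with the minimum and maximum reported in the descriptives for PHQ8.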
Since we use several linear models in this report, which assume normally distributed residuals, we could benefit from a power transformation of our dependent variable PHQ8 (using the Yeo-Johnson function). We name this transformed variable PHQ8_t.
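The code that created PHQ8_t is not shown in this report. The following base-R sketch shows one way such a transformation could be computed. It is an assumption, not the original implementation: the helper names `yeo_johnson` and `yj_loglik` are hypothetical, the data are simulated, and lambda is chosen by maximizing the normal profile log-likelihood.

```r
# Yeo-Johnson transformation of a numeric vector y for a given lambda.
yeo_johnson <- function(y, lambda) {
  out <- numeric(length(y))
  pos <- y >= 0
  if (abs(lambda) > 1e-8) {
    out[pos] <- ((y[pos] + 1)^lambda - 1) / lambda
  } else {
    out[pos] <- log(y[pos] + 1)
  }
  if (abs(lambda - 2) > 1e-8) {
    out[!pos] <- -(((1 - y[!pos])^(2 - lambda) - 1) / (2 - lambda))
  } else {
    out[!pos] <- -log(1 - y[!pos])
  }
  out
}

# Profile log-likelihood of lambda, assuming the transformed values are normal.
yj_loglik <- function(lambda, y) {
  z <- yeo_johnson(y, lambda)
  n <- length(y)
  -n / 2 * log(mean((z - mean(z))^2)) +
    (lambda - 1) * sum(sign(y) * log(abs(y) + 1))
}

# Simulated right-skewed scores on a 0-24 scale (NOT the survey data).
set.seed(1)
phq8_sim <- pmin(rnbinom(1000, size = 1.3, mu = 4.7), 24)

# Pick the lambda that maximizes the profile log-likelihood.
lambda_hat <- optimize(yj_loglik, interval = c(-2, 2),
                       y = phq8_sim, maximum = TRUE)$maximum
phq8_sim_t <- yeo_johnson(phq8_sim, lambda_hat)

# Sample skewness before and after the transformation.
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3
c(lambda = lambda_hat, skew_raw = skew(phq8_sim), skew_t = skew(phq8_sim_t))
```

With real data, the same two steps (estimate lambda, then transform) would be applied to `data$PHQ8` and the result stored as `data$PHQ8_t`. Packaged implementations exist as well and may differ in details such as standardization of the output.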
# To summarize the dependent continuous variable, we use the descriptives()
# function from the jmv package.
descriptives <- jmv::descriptives(
data = data,
vars = "PHQ8",
freq = TRUE,
box = TRUE,
median = FALSE,
range = TRUE,
sd = TRUE,
pc = TRUE)
descriptives$plots
descriptives$descriptives
Descriptives
──────────────────────────────
PHQ8
──────────────────────────────
N 1484
Missing 0
Mean 4.71
Standard deviation 4.62
Range 24.0
Minimum 0.00
Maximum 24.0
25th percentile 1.00
50th percentile 3.00
75th percentile 7.00
──────────────────────────────
In the next step, we assess the demographic characteristics of the respondents in the Czech sample.
# To summarize the key demographic variables, we use the descriptives()
# function from the jmv package.
demo_descriptives <- jmv::descriptives(
data = data,
vars = vars("q01_gender",
"q02_age_group",
"q03_relationship_type",
"q04_children",
"q11_education"),
bar = TRUE,
freq = TRUE,
missing = FALSE,
mean = FALSE,
median = FALSE,
sd = FALSE,
min = FALSE,
max = FALSE)
demo_descriptives$plots
demo_descriptives$frequencies
FREQUENCIES
Frequencies of q01_gender
──────────────────────────────────────────────────
Levels Counts % of Total Cumulative %
──────────────────────────────────────────────────
female 1054 71.0 71.0
male 430 29.0 100.0
──────────────────────────────────────────────────
Frequencies of q02_age_group
───────────────────────────────────────────────────────
Levels Counts % of Total Cumulative %
───────────────────────────────────────────────────────
16-29 years 379 25.5 25.5
30-49 years 440 29.6 55.2
50-64 years 206 13.9 69.1
65+ 459 30.9 100.0
───────────────────────────────────────────────────────
Frequencies of q03_relationship_type
────────────────────────────────────────────────────────
Levels Counts % of Total Cumulative %
────────────────────────────────────────────────────────
single 332 22.4 22.4
relationship 283 19.1 41.4
married 586 39.5 80.9
divorced 155 10.4 91.4
widowed 128 8.6 100.0
────────────────────────────────────────────────────────
Frequencies of q04_children
──────────────────────────────────────────────────
Levels Counts % of Total Cumulative %
──────────────────────────────────────────────────
yes 937 63.1 63.1
no 547 36.9 100.0
──────────────────────────────────────────────────
Frequencies of q11_education
─────────────────────────────────────────────────────────
Levels Counts % of Total Cumulative %
─────────────────────────────────────────────────────────
unfin_element 5 0.3 0.3
element 109 7.3 7.7
unfin_hs 74 5.0 12.7
hs 537 36.2 48.9
undergrad 152 10.2 59.1
postgrad 607 40.9 100.0
─────────────────────────────────────────────────────────
After the descriptive statistics, we proceed to build and fit the regression model based on our hypotheses.
The model has one continuous dependent variable, PHQ8. The only other continuous variable in the model is q02_age, which is entered as a covariate. The rest of the variables are either categorical (both nominal and ordinal) or binary. The linReg() function from the jmv package automatically handles them as dummy variables with reference levels, so it is not necessary to create further dummy variables prior to this analysis.
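To make the idea of dummy variables with reference levels concrete, here is a small base-R sketch. The toy data are hypothetical; only the level names mimic q36_econ_worry. Base R's `lm()` is used instead of jmv for brevity, on the assumption that the dummy-coding logic is analogous.

```r
# Hypothetical toy data; only the level names mimic q36_econ_worry.
toy <- data.frame(
  econ_worry = factor(c("very_serious", "serious", "limited",
                        "serious", "limited", "very_serious")),
  y = c(9, 6, 2, 7, 3, 10)
)

# By default the first level in alphabetical order is the reference;
# relevel() makes "very_serious" the reference instead.
toy$econ_worry <- relevel(toy$econ_worry, ref = "very_serious")

# Each non-reference level gets its own 0/1 dummy column.
model.matrix(~ econ_worry, data = toy)

# The intercept is the mean of the reference group; each coefficient is the
# difference between a level's mean and the reference group's mean.
fit <- lm(y ~ econ_worry, data = toy)
coef(fit)
```

This is why the coefficient tables label rows as contrasts such as "serious – very_serious": every estimate is read relative to the chosen reference level.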
As a first step in building the regression model, we conduct a correlation analysis. Since we do not presume linearity between all of the variables, we use Spearman’s rank coefficient instead of Pearson’s r. The results below need to be interpreted with caution, since some of the variables are categorical without a defined order (such as q03_relationship_type). For categorical variables, comparisons using the chi-square test would be more appropriate; in this step, however, we are primarily looking at the relationship between the outcome (PHQ8) and the theorized predictors. Statistically non-significant correlations (p > 0.05) are crossed out in the correlation matrix.
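The reason for preferring a rank-based coefficient can be shown with a short simulated example. The data are hypothetical and chosen to be monotone but strongly non-linear.

```r
# Simulated data with a monotone but non-linear relationship (hypothetical).
set.seed(7)
x <- runif(200, 0, 5)
y <- exp(x) + rnorm(200, sd = 0.5)

# Pearson's r measures linear association; Spearman's rho measures monotone
# association because it is computed on the ranks of the data.
pearson  <- cor(x, y, method = "pearson")
spearman <- cor(x, y, method = "spearman")

# Significance test for Spearman's rho (exact = FALSE avoids warnings with ties).
rho_test <- cor.test(x, y, method = "spearman", exact = FALSE)

c(pearson = pearson, spearman = spearman, p_value = rho_test$p.value)
```

In this example the rank-based coefficient is closer to 1 than Pearson's r, because the relationship is almost perfectly monotone without being linear.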
# While the dataset has been already imported, the values of factor variables
# were changed from numerics to text strings, therefore that dataset is unsuitable
# for correlation analysis. To solve this, we create a parallel dataset,
# again renaming the key variables to a more understandable form.
data_corr <- data_corr %>%
transmute(q01_gender = q01,
q02_age = q02,
q03_relationship_type = q03,
q04_children = q04,
q11_education = q11,
q18_02_soc_media = replace_na(q18_02, 0),
q20_public_info = q20,
q34_02_face_mask = q34_02,
q34_07_hand_washing = q34_07,
q36_econ_worry = q36,
q35_01_contact_close_family = q35_01,
q35_03_contact_friends = q35_03,
q38_alcohol = q38,
q40_smoking = q40,
q42_sport = q42,
q47_self_reporting_health = q47,
q48_chronic_illness = q48,
q49_health_limitations = q49)
data_corr <- cbind(data_corr, PHQ8_t)
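# The p-values use the same (Spearman) method as the plotted coefficients;
# extra arguments are passed on to cor.test().
res1 <- cor.mtest(data_corr, conf.level = .95, method = "spearman", exact = FALSE)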
#Correlation matrix using Spearman coefficient (values with p>0.05 are crossed)
corrplot(cor(data_corr,
method = "spearman",
use = "complete.obs"),
method = "circle",
title = "Correlation Matrix - Spearman Coefficient",
type = "lower",
p.mat = res1$p,
sig.level = .05,
mar = c(0,0,1,0))
In the first set of models, we avoid potentially biased modifications, such as pairwise comparisons, which could lead to overfitting. Instead, we build four successive models in total (“blocks” in the syntax).
The first model uses only the demographic characteristics as predictors. The second model adds the effect of social media consumption, virus information, economic worries and hygienic measures. The third model adds lifestyle variables, such as alcohol, smoking, sport and social contacts. The fourth model further adds the variables related to self-rated health quality. The performance of each model can be seen in the output below.
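The block-wise logic can be sketched with nested base-R models. This is an illustration on simulated data with hypothetical variable names and effect sizes. It uses `lm()` and `anova()` rather than jmv, on the assumption that the comparison between successive blocks is an F test on the change in R².

```r
# Simulated, hypothetical data (not the survey data).
set.seed(42)
n <- 300
sim <- data.frame(
  age    = round(runif(n, 18, 80)),
  gender = factor(sample(c("female", "male"), n, replace = TRUE)),
  sport  = factor(sample(c("no", "yes"), n, replace = TRUE))
)
sim$y <- 2 - 0.015 * sim$age - 0.3 * (sim$gender == "male") -
  0.2 * (sim$sport == "yes") + rnorm(n)

# Block 1: demographic predictors only.
block1 <- lm(y ~ age + gender, data = sim)
# Block 2: block 1 plus a lifestyle predictor.
block2 <- update(block1, . ~ . + sport)

# Change in R-squared between the nested models.
r2 <- c(block1 = summary(block1)$r.squared, block2 = summary(block2)$r.squared)
delta_r2 <- unname(r2["block2"] - r2["block1"])

# F test for the added block, from anova() and computed by hand.
comparison <- anova(block1, block2)
f_manual <- (delta_r2 / 1) / ((1 - r2[["block2"]]) / df.residual(block2))

c(delta_r2 = delta_r2, F_anova = comparison$F[2], F_manual = f_manual)
```

Each later block keeps all earlier predictors, so the R² of successive models can only stay the same or increase. The F test asks whether the increase is larger than expected by chance.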
linreg_theory <- jmv::linReg(
data = data,
dep = "PHQ8_t",
covs = "q02_age",
factors = vars("q01_gender",
"q03_relationship_type",
"q04_children",
"q11_education",
"q18_02_soc_media",
"q20_public_info",
"q34_02_face_mask",
"q34_07_hand_washing",
"q35_01_contact_close_family",
"q35_03_contact_friends",
"q36_econ_worry",
"q38_alcohol",
"q40_smoking",
"q42_sport",
"q47_self_reporting_health",
"q48_chronic_illness",
"q49_health_limitations"),
blocks = list(
list(
"q01_gender",
"q02_age",
"q03_relationship_type",
"q04_children",
"q11_education"),
list(
"q18_02_soc_media",
"q20_public_info",
"q34_02_face_mask",
"q34_07_hand_washing",
"q36_econ_worry"),
list(
"q40_smoking",
"q42_sport",
"q38_alcohol",
"q35_01_contact_close_family",
"q35_03_contact_friends"),
list(
"q47_self_reporting_health",
"q48_chronic_illness",
"q49_health_limitations")),
refLevels = list(
list(
var = "q01_gender",
ref = "female"),
list(
var = "q04_children",
ref = "no"),
list(
var = "q20_public_info",
ref = "no"),
list(
var = "q34_02_face_mask",
ref = "no"),
list(
var = "q34_07_hand_washing",
ref = "no"),
list(
var = "q36_econ_worry",
ref = "very_serious"),
list(
var = "q42_sport",
ref = "no"),
list(
var = "q40_smoking",
ref = "yes"),
list(
var = "q38_alcohol",
ref = "yes"),
list(
var = "q35_01_contact_close_family",
ref = "less_often"),
list(
var = "q35_03_contact_friends",
ref = "less_often"),
list(
var = "q18_02_soc_media",
ref = "yes"),
list(
var = "q03_relationship_type",
ref = "single"),
list(
var = "q47_self_reporting_health",
ref = "very_bad"),
list(
var = "q49_health_limitations",
ref = "limits"),
list(
var = "q11_education",
ref = "unfin_element"),
list(
var = "q48_chronic_illness",
ref = "yes")),
r2Adj = TRUE,
aic = TRUE,
bic = TRUE,
rmse = TRUE,
modelTest = TRUE,
anova = TRUE,
ci = TRUE,
stdEst = TRUE,
ciStdEst = TRUE,
durbin = TRUE,
collin = TRUE)
linreg_theory$modelFit
Model Fit Measures
────────────────────────────────────────────────────────────────────────────────────────────────────
Model R R² Adjusted R² AIC BIC RMSE F df1 df2 p
────────────────────────────────────────────────────────────────────────────────────────────────────
1 0.357 0.128 0.120 4256 4330 1.027 17.7 12 1449 < .001
2 0.406 0.165 0.154 4207 4318 1.005 14.9 19 1442 < .001
3 0.417 0.174 0.159 4205 4353 1.000 11.6 26 1435 < .001
4 0.509 0.259 0.242 4059 4244 0.947 15.2 33 1428 < .001
────────────────────────────────────────────────────────────────────────────────────────────────────
linreg_theory$modelComp
Model Comparisons
────────────────────────────────────────────────────────────────────
Model Model ΔR² F df1 df2 p
────────────────────────────────────────────────────────────────────
1 - 2 0.03697 9.12 7 1442 < .001
2 - 3 0.00917 2.28 7 1435 0.026
3 - 4 0.08564 23.59 7 1428 < .001
────────────────────────────────────────────────────────────────────
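As a rough consistency check, the F statistics in the Model Comparisons table can be approximately recovered from the rounded R² values reported in the Model Fit Measures table. The helper `f_change` is a hypothetical name, and small discrepancies are expected because the inputs are rounded.

```r
# F statistic for the change in R-squared between two nested models:
# F = (delta R2 / df1) / ((1 - R2 of the larger model) / df2)
f_change <- function(delta_r2, r2_full, df1, df2) {
  (delta_r2 / df1) / ((1 - r2_full) / df2)
}

# Inputs copied from the Model Fit Measures and Model Comparisons tables.
f_1_2 <- f_change(delta_r2 = 0.03697, r2_full = 0.165, df1 = 7, df2 = 1442)
f_2_3 <- f_change(delta_r2 = 0.00917, r2_full = 0.174, df1 = 7, df2 = 1435)
f_3_4 <- f_change(delta_r2 = 0.08564, r2_full = 0.259, df1 = 7, df2 = 1428)

# Upper-tail p-value for the smallest of the three F statistics.
p_2_3 <- pf(f_2_3, df1 = 7, df2 = 1435, lower.tail = FALSE)

round(c(f_1_2 = f_1_2, f_2_3 = f_2_3, f_3_4 = f_3_4, p_2_3 = p_2_3), 3)
```

The recomputed values agree with the reported F statistics (9.12, 2.28 and 23.59) to within rounding. They show that the third block adds comparatively little explained variance, while the health-related fourth block adds the most.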
linreg_theory$models
MODEL SPECIFIC RESULTS
MODEL 1
Omnibus ANOVA Test
──────────────────────────────────────────────────────────────────────────────────────
Sum of Squares df Mean Square F p
──────────────────────────────────────────────────────────────────────────────────────
q01_gender 22.83 1 22.826 21.430 < .001
q02_age 59.72 1 59.721 56.067 < .001
q03_relationship_type 13.37 4 3.344 3.139 0.014
q04_children 3.95 1 3.953 3.711 0.054
q11_education 2.98 5 0.595 0.559 0.732
Residuals 1543.43 1449 1.065
──────────────────────────────────────────────────────────────────────────────────────
Note. Type 3 sum of squares
Model Coefficients - PHQ8_t
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Predictor Estimate SE Lower Upper t p Stand. Estimate Lower Upper
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Intercept ᵃ 2.2721 0.47370 1.3429 3.20130 4.796 < .001
q01_gender:
male – female -0.2879 0.06219 -0.4099 -0.16589 -4.629 < .001 -0.262 -0.372 -0.15075
q02_age -0.0155 0.00207 -0.0196 -0.01144 -7.488 < .001 -0.290 -0.366 -0.21396
q03_relationship_type:
relationship – single -0.1228 0.08942 -0.2982 0.05263 -1.373 0.170 -0.112 -0.271 0.04783
married – single -0.1300 0.10580 -0.3375 0.07754 -1.229 0.219 -0.118 -0.307 0.07046
divorced – single 0.1302 0.13185 -0.1284 0.38886 0.988 0.323 0.118 -0.117 0.35338
widowed – single 0.1508 0.14748 -0.1385 0.44011 1.023 0.307 0.137 -0.126 0.39995
q04_children:
yes – no -0.1725 0.08955 -0.3482 0.00315 -1.926 0.054 -0.157 -0.316 0.00286
q11_education:
element – unfin_element 0.4151 0.47546 -0.5175 1.34777 0.873 0.383 0.377 -0.470 1.22479
unfin_hs – unfin_element 0.3657 0.47904 -0.5740 1.30538 0.763 0.445 0.332 -0.522 1.18627
hs – unfin_element 0.3700 0.46488 -0.5419 1.28188 0.796 0.426 0.336 -0.492 1.16491
undergrad – unfin_element 0.2948 0.47100 -0.6291 1.21870 0.626 0.531 0.268 -0.572 1.10750
postgrad – unfin_element 0.2931 0.46415 -0.6174 1.20357 0.631 0.528 0.266 -0.561 1.09375
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
ᵃ Represents reference level
ASSUMPTION CHECKS
Durbin–Watson Test for Autocorrelation
────────────────────────────────────────────
Autocorrelation DW Statistic p
────────────────────────────────────────────
0.0360 1.93 0.118
────────────────────────────────────────────
Collinearity Statistics
──────────────────────────────────────────────
VIF Tolerance
──────────────────────────────────────────────
q01_gender 1.05 0.955
q02_age 1.58 0.634
q03_relationship_type 1.17 0.856
q04_children 1.60 0.625
q11_education 1.04 0.965
──────────────────────────────────────────────
MODEL 2
Omnibus ANOVA Test
──────────────────────────────────────────────────────────────────────────────────────
Sum of Squares df Mean Square F p
──────────────────────────────────────────────────────────────────────────────────────
q01_gender 20.358 1 20.358 19.862 < .001
q02_age 43.639 1 43.639 42.575 < .001
q03_relationship_type 12.964 4 3.241 3.162 0.013
q04_children 3.285 1 3.285 3.205 0.074
q11_education 2.503 5 0.501 0.488 0.785
q18_02_soc_media 8.294 1 8.294 8.092 0.005
q20_public_info 9.429 2 4.715 4.600 0.010
q34_02_face_mask 2.377 1 2.377 2.319 0.128
q34_07_hand_washing 0.542 1 0.542 0.528 0.467
q36_econ_worry 40.325 2 20.162 19.671 < .001
Residuals 1478.031 1442 1.025
──────────────────────────────────────────────────────────────────────────────────────
Note. Type 3 sum of squares
Model Coefficients - PHQ8_t
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Predictor Estimate SE Lower Upper t p Stand. Estimate Lower Upper
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Intercept ᵃ 2.3077 0.50752 1.3121 3.30325 4.547 < .001
q01_gender:
male – female -0.2729 0.06124 -0.3931 -0.15280 -4.457 < .001 -0.2480 -0.3572 -0.13886
q02_age -0.0138 0.00212 -0.0180 -0.00965 -6.525 < .001 -0.2582 -0.3358 -0.18055
q03_relationship_type:
relationship – single -0.1783 0.08826 -0.3514 -0.00515 -2.020 0.044 -0.1620 -0.3193 -0.00468
married – single -0.1535 0.10428 -0.3580 0.05106 -1.472 0.141 -0.1395 -0.3254 0.04640
divorced – single 0.0420 0.12998 -0.2130 0.29695 0.323 0.747 0.0381 -0.1936 0.26985
widowed – single 0.1310 0.14505 -0.1536 0.41548 0.903 0.367 0.1190 -0.1396 0.37757
q04_children:
yes – no -0.1578 0.08815 -0.3307 0.01512 -1.790 0.074 -0.1434 -0.3005 0.01374
q11_education:
element – unfin_element 0.5339 0.46709 -0.3823 1.45017 1.143 0.253 0.4852 -0.3474 1.31784
unfin_hs – unfin_element 0.4532 0.47044 -0.4696 1.37604 0.963 0.336 0.4119 -0.4268 1.25048
hs – unfin_element 0.5047 0.45649 -0.3907 1.40017 1.106 0.269 0.4587 -0.3551 1.27241
undergrad – unfin_element 0.4180 0.46260 -0.4894 1.32547 0.904 0.366 0.3799 -0.4447 1.20452
postgrad – unfin_element 0.4604 0.45598 -0.4340 1.35489 1.010 0.313 0.4184 -0.3944 1.23126
q18_02_soc_media:
no – yes -0.1812 0.06369 -0.3061 -0.05624 -2.845 0.005 -0.1646 -0.2782 -0.05111
q20_public_info:
yes – no -0.2323 0.07753 -0.3844 -0.08020 -2.996 0.003 -0.2111 -0.3493 -0.07288
do_not_know – no -0.2292 0.10812 -0.4413 -0.01713 -2.120 0.034 -0.2083 -0.4010 -0.01556
q34_02_face_mask:
yes – no 0.2439 0.16019 -0.0703 0.55817 1.523 0.128 0.2217 -0.0639 0.50724
q34_07_hand_washing:
yes – no 0.0987 0.13574 -0.1676 0.36495 0.727 0.467 0.0897 -0.1523 0.33165
q36_econ_worry:
serious – very_serious -0.1907 0.07227 -0.3325 -0.04898 -2.639 0.008 -0.1733 -0.3022 -0.04451
limited – very_serious -0.4514 0.07541 -0.5994 -0.30351 -5.986 < .001 -0.4103 -0.5447 -0.27582
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
ᵃ Represents reference level
ASSUMPTION CHECKS
Durbin–Watson Test for Autocorrelation
────────────────────────────────────────────
Autocorrelation DW Statistic p
────────────────────────────────────────────
0.0412 1.91 0.084
────────────────────────────────────────────
Collinearity Statistics
──────────────────────────────────────────────
VIF Tolerance
──────────────────────────────────────────────
q01_gender 1.05 0.951
q02_age 1.64 0.608
q03_relationship_type 1.17 0.852
q04_children 1.61 0.623
q11_education 1.04 0.961
q18_02_soc_media 1.08 0.927
q20_public_info 1.03 0.975
q34_02_face_mask 1.01 0.990
q34_07_hand_washing 1.03 0.975
q36_econ_worry 1.01 0.990
──────────────────────────────────────────────
MODEL 3
Omnibus ANOVA Test
────────────────────────────────────────────────────────────────────────────────────────────
Sum of Squares df Mean Square F p
────────────────────────────────────────────────────────────────────────────────────────────
q01_gender 21.919 1 21.919 21.517 < .001
q02_age 40.637 1 40.637 39.892 < .001
q03_relationship_type 12.946 4 3.236 3.177 0.013
q04_children 3.489 1 3.489 3.426 0.064
q11_education 2.328 5 0.466 0.457 0.808
q18_02_soc_media 7.051 1 7.051 6.922 0.009
q20_public_info 10.215 2 5.108 5.014 0.007
q34_02_face_mask 2.715 1 2.715 2.665 0.103
q34_07_hand_washing 0.830 1 0.830 0.815 0.367
q36_econ_worry 39.907 2 19.953 19.587 < .001
q40_smoking 1.571 1 1.571 1.542 0.215
q42_sport 8.035 1 8.035 7.887 0.005
q38_alcohol 0.607 1 0.607 0.596 0.440
q35_01_contact_close_family 4.022 2 2.011 1.974 0.139
q35_03_contact_friends 0.775 2 0.388 0.381 0.683
Residuals 1461.804 1435 1.019
────────────────────────────────────────────────────────────────────────────────────────────
Note. Type 3 sum of squares
Model Coefficients - PHQ8_t
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Predictor Estimate SE Lower Upper t p Stand. Estimate Lower Upper
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Intercept ᵃ 2.3665 0.51679 1.35274 3.38023 4.579 < .001
q01_gender:
male – female -0.2869 0.06185 -0.40823 -0.16558 -4.639 < .001 -0.2607 -0.37098 -0.15047
q02_age -0.0137 0.00216 -0.01790 -0.00941 -6.316 < .001 -0.2553 -0.33465 -0.17604
q03_relationship_type:
relationship – single -0.1725 0.08831 -0.34572 7.62e-4 -1.953 0.051 -0.1567 -0.31417 6.92e-4
married – single -0.1387 0.10417 -0.34303 0.06565 -1.331 0.183 -0.1260 -0.31173 0.05966
divorced – single 0.0678 0.12999 -0.18722 0.32275 0.521 0.602 0.0616 -0.17014 0.29330
widowed – single 0.1426 0.14522 -0.14222 0.42750 0.982 0.326 0.1296 -0.12925 0.38849
q04_children:
yes – no -0.1636 0.08841 -0.33705 0.00980 -1.851 0.064 -0.1487 -0.30630 0.00890
q11_education:
element – unfin_element 0.5551 0.46717 -0.36135 1.47147 1.188 0.235 0.5044 -0.32838 1.33720
unfin_hs – unfin_element 0.4605 0.47001 -0.46151 1.38245 0.980 0.327 0.4185 -0.41940 1.25630
hs – unfin_element 0.5446 0.45690 -0.35161 1.44090 1.192 0.233 0.4949 -0.31953 1.30942
undergrad – unfin_element 0.4732 0.46336 -0.43569 1.38217 1.021 0.307 0.4301 -0.39594 1.25605
postgrad – unfin_element 0.5173 0.45670 -0.37862 1.41314 1.133 0.258 0.4701 -0.34407 1.28419
q18_02_soc_media:
no – yes -0.1682 0.06393 -0.29363 -0.04279 -2.631 0.009 -0.1529 -0.26683 -0.03889
q20_public_info:
yes – no -0.2408 0.07750 -0.39281 -0.08877 -3.107 0.002 -0.2188 -0.35697 -0.08067
do_not_know – no -0.2487 0.10815 -0.46090 -0.03659 -2.300 0.022 -0.2260 -0.41884 -0.03325
q34_02_face_mask:
yes – no 0.2619 0.16044 -0.05282 0.57664 1.632 0.103 0.2380 -0.04800 0.52402
q34_07_hand_washing:
yes – no 0.1229 0.13616 -0.14419 0.39000 0.903 0.367 0.1117 -0.13103 0.35442
q36_econ_worry:
serious – very_serious -0.1820 0.07233 -0.32384 -0.04008 -2.516 0.012 -0.1654 -0.29429 -0.03643
limited – very_serious -0.4475 0.07548 -0.59561 -0.29948 -5.929 < .001 -0.4067 -0.54126 -0.27216
q40_smoking:
no – yes -0.0986 0.07940 -0.25435 0.05717 -1.242 0.215 -0.0896 -0.23114 0.05195
q42_sport:
yes – no -0.1563 0.05565 -0.26545 -0.04712 -2.808 0.005 -0.1420 -0.24123 -0.04282
q38_alcohol:
no – yes -0.0440 0.05697 -0.15573 0.06778 -0.772 0.440 -0.0400 -0.14152 0.06159
q35_01_contact_close_family:
as_before – less_often 0.0138 0.05978 -0.10347 0.13105 0.231 0.818 0.0125 -0.09403 0.11909
more_often – less_often 0.1752 0.09034 -0.00203 0.35241 1.939 0.053 0.1592 -0.00184 0.32025
q35_03_contact_friends:
as_before – less_often 0.0438 0.14224 -0.23519 0.32284 0.308 0.758 0.0398 -0.21373 0.29338
more_often – less_often 0.2690 0.32530 -0.36913 0.90709 0.827 0.408 0.2444 -0.33545 0.82432
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
ᵃ Represents reference level
ASSUMPTION CHECKS
Durbin–Watson Test for Autocorrelation
────────────────────────────────────────────
Autocorrelation DW Statistic p
────────────────────────────────────────────
0.0344 1.93 0.132
────────────────────────────────────────────
Collinearity Statistics
────────────────────────────────────────────────────
VIF Tolerance
────────────────────────────────────────────────────
q01_gender 1.07 0.939
q02_age 1.68 0.594
q03_relationship_type 1.18 0.849
q04_children 1.62 0.619
q11_education 1.05 0.952
q18_02_soc_media 1.09 0.920
q20_public_info 1.03 0.972
q34_02_face_mask 1.02 0.985
q34_07_hand_washing 1.03 0.970
q36_econ_worry 1.01 0.987
q40_smoking 1.03 0.972
q42_sport 1.04 0.957
q38_alcohol 1.05 0.952
q35_01_contact_close_family 1.05 0.953
q35_03_contact_friends 1.02 0.976
────────────────────────────────────────────────────
MODEL 4
Omnibus ANOVA Test
─────────────────────────────────────────────────────────────────────────────────────────────
Sum of Squares df Mean Square F p
─────────────────────────────────────────────────────────────────────────────────────────────
q01_gender 27.2954 1 27.2954 29.7472 < .001
q02_age 70.3096 1 70.3096 76.6254 < .001
q03_relationship_type 11.4670 4 2.8668 3.1243 0.014
q04_children 1.8411 1 1.8411 2.0065 0.157
q11_education 5.7255 5 1.1451 1.2480 0.284
q18_02_soc_media 5.9957 1 5.9957 6.5343 0.011
q20_public_info 8.7036 2 4.3518 4.7427 0.009
q34_02_face_mask 2.3673 1 2.3673 2.5800 0.108
q34_07_hand_washing 0.7288 1 0.7288 0.7942 0.373
q36_econ_worry 26.8993 2 13.4497 14.6578 < .001
q40_smoking 1.3333 1 1.3333 1.4531 0.228
q42_sport 0.0893 1 0.0893 0.0973 0.755
q38_alcohol 5.1848 1 5.1848 5.6506 0.018
q35_01_contact_close_family 3.7106 2 1.8553 2.0220 0.133
q35_03_contact_friends 1.1510 2 0.5755 0.6272 0.534
q47_self_reporting_health 56.4257 4 14.1064 15.3736 < .001
q48_chronic_illness 4.1183 1 4.1183 4.4882 0.034
q49_health_limitations 9.9802 2 4.9901 5.4384 0.004
Residuals 1310.2988 1428 0.9176
─────────────────────────────────────────────────────────────────────────────────────────────
Note. Type 3 sum of squares
Model Coefficients - PHQ8_t
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Predictor Estimate SE Lower Upper t p Stand. Estimate Lower Upper
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Intercept ᵃ 3.61048 0.63562 2.36364 4.85733 5.680 < .001
q01_gender:
male – female -0.32062 0.05879 -0.43594 -0.20531 -5.454 < .001 -0.29137 -0.39616 -0.18657
q02_age -0.01839 0.00210 -0.02251 -0.01427 -8.754 < .001 -0.34394 -0.42101 -0.26686
q03_relationship_type:
relationship – single -0.20027 0.08407 -0.36519 -0.03535 -2.382 0.017 -0.18200 -0.33187 -0.03212
married – single -0.16564 0.09903 -0.35990 0.02861 -1.673 0.095 -0.15053 -0.32706 0.02600
divorced – single 0.03575 0.12361 -0.20672 0.27822 0.289 0.772 0.03249 -0.18786 0.25283
widowed – single 0.03203 0.13851 -0.23968 0.30374 0.231 0.817 0.02911 -0.21781 0.27602
q04_children:
yes – no -0.11911 0.08409 -0.28407 0.04584 -1.417 0.157 -0.10824 -0.25814 0.04166
q11_education:
element – unfin_element 0.63075 0.44456 -0.24130 1.50280 1.419 0.156 0.57320 -0.21928 1.36567
unfin_hs – unfin_element 0.45938 0.44721 -0.41787 1.33663 1.027 0.304 0.41746 -0.37974 1.21467
hs – unfin_element 0.69132 0.43536 -0.16269 1.54532 1.588 0.113 0.62824 -0.14785 1.40432
undergrad – unfin_element 0.63355 0.44147 -0.23246 1.49955 1.435 0.151 0.57574 -0.21125 1.36272
postgrad – unfin_element 0.68915 0.43520 -0.16455 1.54285 1.584 0.114 0.62627 -0.14953 1.40207
q18_02_soc_media:
no – yes -0.15527 0.06074 -0.27443 -0.03612 -2.556 0.011 -0.14111 -0.24939 -0.03282
q20_public_info:
yes – no -0.22268 0.07396 -0.36776 -0.07759 -3.011 0.003 -0.20236 -0.33421 -0.07051
do_not_know – no -0.23478 0.10299 -0.43682 -0.03275 -2.280 0.023 -0.21336 -0.39696 -0.02976
q34_02_face_mask:
yes – no 0.24485 0.15244 -0.05418 0.54388 1.606 0.108 0.22251 -0.04923 0.49426
q34_07_hand_washing:
yes – no 0.11542 0.12951 -0.13863 0.36946 0.891 0.373 0.10488 -0.12598 0.33575
q36_econ_worry:
serious – very_serious -0.16588 0.06886 -0.30096 -0.03081 -2.409 0.016 -0.15075 -0.27349 -0.02800
limited – very_serious -0.37513 0.07203 -0.51643 -0.23383 -5.208 < .001 -0.34090 -0.46931 -0.21250
q40_smoking:
no – yes -0.09143 0.07585 -0.24021 0.05735 -1.205 0.228 -0.08309 -0.21830 0.05212
q42_sport:
yes – no -0.01685 0.05400 -0.12278 0.08909 -0.312 0.755 -0.01531 -0.11158 0.08096
q38_alcohol:
no – yes -0.12974 0.05458 -0.23680 -0.02268 -2.377 0.018 -0.11790 -0.21519 -0.02061
q35_01_contact_close_family:
as_before – less_often 0.00738 0.05681 -0.10407 0.11882 0.130 0.897 0.00670 -0.09457 0.10798
more_often – less_often 0.16629 0.08587 -0.00216 0.33473 1.937 0.053 0.15111 -0.00196 0.30419
q35_03_contact_friends:
as_before – less_often 0.10946 0.13522 -0.15579 0.37471 0.809 0.418 0.09947 -0.14158 0.34051
more_often – less_often 0.24785 0.30925 -0.35878 0.85448 0.801 0.423 0.22523 -0.32604 0.77651
q47_self_reporting_health:
excellent – very_bad -1.14692 0.41295 -1.95697 -0.33688 -2.777 0.006 -1.04227 -1.77840 -0.30614
good – very_bad -0.90110 0.40847 -1.70235 -0.09984 -2.206 0.028 -0.81887 -1.54702 -0.09073
neutral – very_bad -0.59178 0.40765 -1.39143 0.20787 -1.452 0.147 -0.53778 -1.26447 0.18891
bad – very_bad -0.08346 0.41163 -0.89091 0.72400 -0.203 0.839 -0.07584 -0.80962 0.65794
q48_chronic_illness:
no – yes -0.13069 0.06169 -0.25169 -0.00968 -2.119 0.034 -0.11876 -0.22873 -0.00880
q49_health_limitations:
partially_limits – limits -0.20391 0.14795 -0.49414 0.08631 -1.378 0.168 -0.18531 -0.44905 0.07843
no_limits – limits -0.38518 0.15172 -0.68279 -0.08757 -2.539 0.011 -0.35003 -0.62049 -0.07958
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
ᵃ Represents reference level
ASSUMPTION CHECKS
Durbin–Watson Test for Autocorrelation
────────────────────────────────────────────
Autocorrelation DW Statistic p
────────────────────────────────────────────
0.0348 1.93 0.150
────────────────────────────────────────────
Collinearity Statistics
────────────────────────────────────────────────────
VIF Tolerance
────────────────────────────────────────────────────
q01_gender 1.07 0.937
q02_age 1.73 0.580
q03_relationship_type 1.18 0.847
q04_children 1.62 0.618
q11_education 1.06 0.946
q18_02_soc_media 1.09 0.919
q20_public_info 1.03 0.968
q34_02_face_mask 1.02 0.984
q34_07_hand_washing 1.03 0.967
q36_econ_worry 1.02 0.983
q40_smoking 1.04 0.965
q42_sport 1.07 0.936
q38_alcohol 1.06 0.944
q35_01_contact_close_family 1.05 0.952
q35_03_contact_friends 1.03 0.974
q47_self_reporting_health 1.09 0.914
q48_chronic_illness 1.21 0.824
q49_health_limitations 1.19 0.841
────────────────────────────────────────────────────
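The two columns of the collinearity tables are linked: tolerance is the reciprocal of the VIF (for q02_age above, 1 / 1.73 ≈ 0.58). For a single numeric predictor, the VIF equals 1 / (1 − R²ⱼ), where R²ⱼ comes from regressing that predictor on the remaining ones. The following is a minimal sketch of that calculation, assuming R's built-in mtcars data rather than the survey data; the choice of predictors is purely illustrative, and the way jmv treats multi-level factors is not covered here.

```r
# Illustrative only: built-in mtcars data, not the survey data.
predictors <- mtcars[, c("wt", "hp", "disp")]

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing predictor j on the others
vif <- sapply(names(predictors), function(v) {
  others <- names(predictors)[names(predictors) != v]
  fit <- lm(reformulate(others, response = v), data = predictors)
  1 / (1 - summary(fit)$r.squared)
})

# Tolerance is the reciprocal of the VIF
tolerance <- 1 / vif
print(round(cbind(VIF = vif, Tolerance = tolerance), 3))
```

VIF values close to 1, as in the tables above, indicate that a predictor shares little variance with the others.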
As an alternative to the theory-derived, manually built set of models, we use stepwise regression, combining forward and backward selection of the predictors. Using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) in turn to rank the candidate models, the stepAIC() algorithm from the MASS package arrives at two models that are simpler than the 18-predictor model selected with the previous manual approach. However, while these two models perform well on this particular sample, they may underperform on the international sample, since stepwise regression is prone to overfitting.
With AIC-ranked stepwise selection, the algorithm arrives at a 13-predictor model; with BIC-ranked selection, it arrives at a 7-predictor model.
To allow direct comparison with the manually selected model, we enter the models chosen in the previous step (based on the AIC and BIC criteria) into the linReg() function of the jmv package. The first, simpler model 1 contains the 7 predictors of the BIC-selected model. Model 2 adds 6 further variables from the AIC-selected stepwise model (13 in total). Note that the linReg() call enters q40_smoking in the second block, whereas the stepAIC() output lists q35_01_contact_close_family as the thirteenth predictor.
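The AIC- and BIC-ranked searches differ only in the penalty per estimated parameter that stepAIC() receives through its k argument: k = 2 for AIC (the default) and k = log(n) for BIC. The sketch below illustrates this on R's built-in mtcars data, which is an assumption made for the example and not the survey data; the object names are hypothetical.

```r
# Illustrative only: built-in mtcars data, not the survey data.
full_model <- lm(mpg ~ ., data = mtcars)
n_used <- nobs(full_model)  # observations actually used in the fit

# AIC ranking: default penalty of k = 2 per estimated parameter
step_aic <- MASS::stepAIC(full_model, direction = "both", trace = FALSE)

# BIC ranking: same search, but with the heavier penalty k = log(n)
step_bic <- MASS::stepAIC(full_model, direction = "both", trace = FALSE,
                          k = log(n_used))

# Compare the formulas of the two selected models
print(formula(step_aic))
print(formula(step_bic))
```

Because log(n) exceeds 2 for any sample larger than seven observations, the BIC search penalizes additional predictors more heavily, which is why it tends to stop at a smaller model.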
# We use the MASS package, whose stepAIC() function performs stepwise
# regression model selection. We again restrict the dataset to the variables
# specified in the hypotheses by dropping the remaining columns.
linreg_stepwise <- data %>%
  dplyr::select(-c(id,
                   q02_age_group,
                   q30_concern_infection_covid,
                   q31_concern_infection_friends,
                   q33_01_concern_situation,
                   q33_02_concern_low_control,
                   q33_03_concern_survival_covid,
                   q33_04_concern_change_employment,
                   q33_05_concern_infecting_others,
                   q50_comment,
                   PHQ8))

# Fit the full linear model using the lm() function from base R
full.model_MASS <- lm(PHQ8_t ~ .,
                      data = linreg_stepwise,
                      na.action = na.omit)

# Stepwise regression using MASS::stepAIC(), ranked on AIC (default k = 2)
step.model_AIC <- MASS::stepAIC(full.model_MASS,
                                direction = "both",
                                trace = FALSE)

# Stepwise regression using MASS::stepAIC(), ranked on BIC (k = log(n)).
# n is the number of observations actually used in the fit, because rows
# with missing values are dropped by na.omit.
step.model_BIC <- MASS::stepAIC(full.model_MASS,
                                direction = "both",
                                trace = FALSE,
                                k = log(nobs(full.model_MASS)))
# To construct this regression model, we use the linReg()
# function from the jmv package.
linreg_stepwise2 <- jmv::linReg(
  data = data,
  dep = "PHQ8_t",
  covs = "q02_age",
  factors = vars("q01_gender",
                 "q03_relationship_type",
                 "q04_children",
                 "q18_02_soc_media",
                 "q20_public_info",
                 "q34_02_face_mask",
                 "q36_econ_worry",
                 "q38_alcohol",
                 "q40_smoking",
                 "q47_self_reporting_health",
                 "q48_chronic_illness",
                 "q49_health_limitations"),
  blocks = list(
    list(
      "q01_gender",
      "q02_age",
      "q04_children",
      "q36_econ_worry",
      "q18_02_soc_media",
      "q47_self_reporting_health",
      "q49_health_limitations"),
    list(
      "q03_relationship_type",
      "q20_public_info",
      "q34_02_face_mask",
      "q38_alcohol",
      "q40_smoking",
      "q48_chronic_illness")),
  refLevels = list(
    list(
      var = "q01_gender",
      ref = "female"),
    list(
      var = "q04_children",
      ref = "no"),
    list(
      var = "q20_public_info",
      ref = "no"),
    list(
      var = "q34_02_face_mask",
      ref = "no"),
    list(
      var = "q36_econ_worry",
      ref = "very_serious"),
    list(
      var = "q40_smoking",
      ref = "yes"),
    list(
      var = "q38_alcohol",
      ref = "yes"),
    list(
      var = "q18_02_soc_media",
      ref = "yes"),
    list(
      var = "q03_relationship_type",
      ref = "single"),
    list(
      var = "q47_self_reporting_health",
      ref = "very_bad"),
    list(
      var = "q49_health_limitations",
      ref = "limits"),
    list(
      var = "q48_chronic_illness",
      ref = "yes")),
  r2Adj = TRUE,
  aic = TRUE,
  bic = TRUE,
  rmse = TRUE,
  modelTest = TRUE,
  anova = TRUE,
  ci = TRUE,
  stdEst = TRUE,
  ciStdEst = TRUE,
  durbin = TRUE,
  collin = TRUE)
base::summary(step.model_AIC)
Call:
lm(formula = PHQ8_t ~ q01_gender + q02_age + q03_relationship_type +
q04_children + q18_02_soc_media + q20_public_info + q34_02_face_mask +
q35_01_contact_close_family + q36_econ_worry + q38_alcohol +
q47_self_reporting_health + q48_chronic_illness + q49_health_limitations,
data = linreg_stepwise, na.action = na.omit)
Residuals:
Min 1Q Median 3Q Max
-2.9052 -0.7074 0.0472 0.6444 2.7508
Coefficients:
Estimate Std. Error t value
(Intercept) 2.84748 0.21803 13.06
q01_gendermale -0.32702 0.05759 -5.68
q02_age -0.01817 0.00204 -8.89
q03_relationship_typerelationship -0.18714 0.08280 -2.26
q03_relationship_typemarried -0.16339 0.09766 -1.67
q03_relationship_typedivorced 0.04996 0.12296 0.41
q03_relationship_typewidowed 0.02361 0.13805 0.17
q04_childrenno 0.12356 0.08343 1.48
q18_02_soc_mediayes 0.16279 0.06019 2.70
q20_public_infono 0.21047 0.07341 2.87
q20_public_infodo_not_know -0.02529 0.08399 -0.30
q34_02_face_maskno -0.24507 0.15083 -1.62
q35_01_contact_close_familyas_before 0.01616 0.05602 0.29
q35_01_contact_close_familymore_often 0.16620 0.08491 1.96
q36_econ_worryserious -0.15839 0.06849 -2.31
q36_econ_worrylimited -0.36619 0.07162 -5.11
q38_alcoholno -0.14346 0.05368 -2.67
q47_self_reporting_healthgood 0.24473 0.06673 3.67
q47_self_reporting_healthneutral 0.55329 0.08851 6.25
q47_self_reporting_healthbad 1.05372 0.15424 6.83
q47_self_reporting_healthvery_bad 1.13657 0.41183 2.76
q48_chronic_illnessno -0.13221 0.06128 -2.16
q49_health_limitationspartially_limits -0.15791 0.14579 -1.08
q49_health_limitationsno_limits -0.34103 0.15000 -2.27
Pr(>|t|)
(Intercept) < 0.0000000000000002 ***
q01_gendermale 0.000000016401 ***
q02_age < 0.0000000000000002 ***
q03_relationship_typerelationship 0.02395 *
q03_relationship_typemarried 0.09451 .
q03_relationship_typedivorced 0.68457
q03_relationship_typewidowed 0.86422
q04_childrenno 0.13882
q18_02_soc_mediayes 0.00692 **
q20_public_infono 0.00420 **
q20_public_infodo_not_know 0.76336
q34_02_face_maskno 0.10443
q35_01_contact_close_familyas_before 0.77295
q35_01_contact_close_familymore_often 0.05050 .
q36_econ_worryserious 0.02089 *
q36_econ_worrylimited 0.000000359690 ***
q38_alcoholno 0.00761 **
q47_self_reporting_healthgood 0.00025 ***
q47_self_reporting_healthneutral 0.000000000536 ***
q47_self_reporting_healthbad 0.000000000012 ***
q47_self_reporting_healthvery_bad 0.00586 **
q48_chronic_illnessno 0.03113 *
q49_health_limitationspartially_limits 0.27895
q49_health_limitationsno_limits 0.02314 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.958 on 1438 degrees of freedom
(22 observations deleted due to missingness)
Multiple R-squared: 0.254, Adjusted R-squared: 0.242
F-statistic: 21.3 on 23 and 1438 DF, p-value: <0.0000000000000002
base::summary(step.model_BIC)
Call:
lm(formula = PHQ8_t ~ q01_gender + q02_age + q04_children + q18_02_soc_media +
q36_econ_worry + q47_self_reporting_health + q49_health_limitations,
data = linreg_stepwise, na.action = na.omit)
Residuals:
Min 1Q Median 3Q Max
-3.106 -0.713 0.049 0.698 2.558
Coefficients:
Estimate Std. Error t value
(Intercept) 2.60650 0.19349 13.47
q01_gendermale -0.30262 0.05641 -5.36
q02_age -0.01776 0.00173 -10.27
q04_childrenno 0.21821 0.06837 3.19
q18_02_soc_mediayes 0.16458 0.06041 2.72
q36_econ_worryserious -0.18491 0.06857 -2.70
q36_econ_worrylimited -0.38561 0.07158 -5.39
q47_self_reporting_healthgood 0.27719 0.06636 4.18
q47_self_reporting_healthneutral 0.60516 0.08602 7.04
q47_self_reporting_healthbad 1.09835 0.15190 7.23
q47_self_reporting_healthvery_bad 1.17617 0.41258 2.85
q49_health_limitationspartially_limits -0.13872 0.14613 -0.95
q49_health_limitationsno_limits -0.36459 0.14941 -2.44
Pr(>|t|)
(Intercept) < 0.0000000000000002 ***
q01_gendermale 0.00000009434033 ***
q02_age < 0.0000000000000002 ***
q04_childrenno 0.0014 **
q18_02_soc_mediayes 0.0065 **
q36_econ_worryserious 0.0071 **
q36_econ_worrylimited 0.00000008360844 ***
q47_self_reporting_healthgood 0.00003126193214 ***
q47_self_reporting_healthneutral 0.00000000000305 ***
q47_self_reporting_healthbad 0.00000000000077 ***
q47_self_reporting_healthvery_bad 0.0044 **
q49_health_limitationspartially_limits 0.3426
q49_health_limitationsno_limits 0.0148 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.966 on 1449 degrees of freedom
(22 observations deleted due to missingness)
Multiple R-squared: 0.235, Adjusted R-squared: 0.229
F-statistic: 37.2 on 12 and 1449 DF, p-value: <0.0000000000000002
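The base R summaries above and the jmv tables in this section report the same effects against different reference levels. For example, summary() shows q04_childrenno = 0.21821 for the BIC model, while jmv model 1, with "no" set as the reference level, reports yes – no = -0.2182. For a two-level factor, changing the reference level only flips the sign of the coefficient. A minimal sketch on simulated data (the toy data frame and its variable names are hypothetical):

```r
# Illustrative only: simulated toy data, not the survey data.
set.seed(42)
toy <- data.frame(
  children = factor(sample(c("yes", "no"), 100, replace = TRUE),
                    levels = c("yes", "no")),
  score = rnorm(100)
)

# "yes" is the reference level, so the coefficient is the no - yes contrast
fit_ref_yes <- lm(score ~ children, data = toy)

# Re-level so that "no" is the reference: the coefficient is now yes - no
toy$children_ref_no <- relevel(toy$children, ref = "no")
fit_ref_no <- lm(score ~ children_ref_no, data = toy)

b_no_vs_yes <- coef(fit_ref_yes)[["childrenno"]]
b_yes_vs_no <- coef(fit_ref_no)[["children_ref_noyes"]]
print(c(no_vs_yes = b_no_vs_yes, yes_vs_no = b_yes_vs_no))
```

The two printed values have the same magnitude and opposite signs, so the model fit itself is unaffected by the choice of reference level.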
linreg_stepwise2$modelFit
Model Fit Measures
────────────────────────────────────────────────────────────────────────────────────────────────────
Model R R² Adjusted R² AIC BIC RMSE F df1 df2 p
────────────────────────────────────────────────────────────────────────────────────────────────────
1 0.485 0.235 0.229 4064 4138 0.962 37.2 12 1449 < .001
2 0.503 0.253 0.242 4049 4176 0.951 22.2 22 1439 < .001
────────────────────────────────────────────────────────────────────────────────────────────────────
linreg_stepwise2$modelComp
Model Comparisons
──────────────────────────────────────────────────────────────────
Model Model ΔR² F df1 df2 p
──────────────────────────────────────────────────────────────────
1 - 2 0.0179 3.45 10 1439 < .001
──────────────────────────────────────────────────────────────────
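The F statistic in the Model Comparisons table can be checked by hand from the rounded values printed above, using F = (ΔR² / df1) / ((1 − R²₂) / df2), where R²₂ is the R² of the larger model. Note that df1 = 10 counts added parameters, not added variables, since several of the added factors have more than two levels. A short sketch, which reproduces the reported value only up to rounding:

```r
# Values copied from the Model Fit Measures and Model Comparisons tables
delta_r2  <- 0.0179  # change in R-squared from model 1 to model 2
r2_model2 <- 0.253   # R-squared of the larger model
df1       <- 10      # number of added parameters
df2       <- 1439    # residual degrees of freedom of the larger model

# F test for the change in R-squared
f_change <- (delta_r2 / df1) / ((1 - r2_model2) / df2)
p_value  <- pf(f_change, df1, df2, lower.tail = FALSE)

print(round(f_change, 2))
print(p_value < 0.001)
```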
linreg_stepwise2$models
MODEL SPECIFIC RESULTS
MODEL 1
Omnibus ANOVA Test
──────────────────────────────────────────────────────────────────────────────────────────
Sum of Squares df Mean Square F p
──────────────────────────────────────────────────────────────────────────────────────────
q01_gender 26.87 1 26.869 28.78 < .001
q02_age 98.57 1 98.568 105.57 < .001
q04_children 9.51 1 9.511 10.19 0.001
q36_econ_worry 28.47 2 14.237 15.25 < .001
q18_02_soc_media 6.93 1 6.930 7.42 0.007
q47_self_reporting_health 71.59 4 17.896 19.17 < .001
q49_health_limitations 14.07 2 7.034 7.53 < .001
Residuals 1352.85 1449 0.934
──────────────────────────────────────────────────────────────────────────────────────────
Note. Type 3 sum of squares
Model Coefficients - PHQ8_t
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Predictor Estimate SE Lower Upper t p Stand. Estimate Lower Upper
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Intercept ᵃ 4.1655 0.41074 3.3597 4.9712 10.141 < .001
q01_gender:
male – female -0.3026 0.05641 -0.4133 -0.1920 -5.365 < .001 -0.2750 -0.376 -0.1745
q02_age -0.0178 0.00173 -0.0211 -0.0144 -10.275 < .001 -0.3321 -0.395 -0.2687
q04_children:
yes – no -0.2182 0.06837 -0.3523 -0.0841 -3.192 0.001 -0.1983 -0.320 -0.0764
q36_econ_worry:
serious – very_serious -0.1849 0.06857 -0.3194 -0.0504 -2.697 0.007 -0.1680 -0.290 -0.0458
limited – very_serious -0.3856 0.07158 -0.5260 -0.2452 -5.387 < .001 -0.3504 -0.478 -0.2228
q18_02_soc_media:
no – yes -0.1646 0.06041 -0.2831 -0.0461 -2.724 0.007 -0.1496 -0.257 -0.0419
q47_self_reporting_health:
excellent – very_bad -1.1762 0.41258 -1.9855 -0.3669 -2.851 0.004 -1.0688 -1.804 -0.3334
good – very_bad -0.8990 0.40919 -1.7016 -0.0963 -2.197 0.028 -0.8170 -1.546 -0.0875
neutral – very_bad -0.5710 0.40919 -1.3737 0.2316 -1.395 0.163 -0.5189 -1.248 0.2105
bad – very_bad -0.0778 0.41362 -0.8892 0.7335 -0.188 0.851 -0.0707 -0.808 0.6666
q49_health_limitations:
partially_limits – limits -0.1387 0.14613 -0.4254 0.1479 -0.949 0.343 -0.1261 -0.387 0.1344
no_limits – limits -0.3646 0.14941 -0.6577 -0.0715 -2.440 0.015 -0.3313 -0.598 -0.0650
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
ᵃ Represents reference level
ASSUMPTION CHECKS
Durbin–Watson Test for Autocorrelation
────────────────────────────────────────────
Autocorrelation DW Statistic p
────────────────────────────────────────────
0.0369 1.92 0.120
────────────────────────────────────────────
Collinearity Statistics
──────────────────────────────────────────────────
VIF Tolerance
──────────────────────────────────────────────────
q01_gender 1.02 0.985
q02_age 1.41 0.711
q04_children 1.30 0.766
q36_econ_worry 1.01 0.994
q18_02_soc_media 1.07 0.933
q47_self_reporting_health 1.07 0.934
q49_health_limitations 1.14 0.874
──────────────────────────────────────────────────
MODEL 2
Omnibus ANOVA Test
─────────────────────────────────────────────────────────────────────────────────────────
Sum of Squares df Mean Square F p
─────────────────────────────────────────────────────────────────────────────────────────
q01_gender 27.91 1 27.912 30.40 < .001
q02_age 82.25 1 82.249 89.58 < .001
q04_children 1.99 1 1.992 2.17 0.141
q36_econ_worry 25.77 2 12.887 14.04 < .001
q18_02_soc_media 5.93 1 5.925 6.45 0.011
q47_self_reporting_health 56.40 4 14.100 15.36 < .001
q49_health_limitations 9.45 2 4.727 5.15 0.006
q03_relationship_type 12.33 4 3.082 3.36 0.010
q20_public_info 7.61 2 3.804 4.14 0.016
q34_02_face_mask 2.28 1 2.282 2.49 0.115
q38_alcohol 5.96 1 5.964 6.50 0.011
q40_smoking 1.64 1 1.642 1.79 0.181
q48_chronic_illness 4.60 1 4.598 5.01 0.025
Residuals 1321.19 1439 0.918
─────────────────────────────────────────────────────────────────────────────────────────
Note. Type 3 sum of squares
Model Coefficients - PHQ8_t
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Predictor Estimate SE Lower Upper t p Stand. Estimate Lower Upper
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Intercept ᵃ 4.3849 0.44382 3.5143 5.2555 9.880 < .001
q01_gender:
male – female -0.3164 0.05738 -0.4289 -0.2038 -5.514 < .001 -0.2875 -0.3898 -0.1852
q02_age -0.0189 0.00200 -0.0228 -0.0150 -9.465 < .001 -0.3538 -0.4271 -0.2805
q04_children:
yes – no -0.1227 0.08333 -0.2862 0.0407 -1.473 0.141 -0.1115 -0.2601 0.0370
q36_econ_worry:
serious – very_serious -0.1555 0.06851 -0.2899 -0.0211 -2.270 0.023 -0.1413 -0.2635 -0.0192
limited – very_serious -0.3632 0.07161 -0.5037 -0.2228 -5.072 < .001 -0.3301 -0.4577 -0.2024
q18_02_soc_media:
no – yes -0.1534 0.06037 -0.2718 -0.0349 -2.540 0.011 -0.1394 -0.2470 -0.0318
q47_self_reporting_health:
excellent – very_bad -1.1430 0.41170 -1.9506 -0.3354 -2.776 0.006 -1.0387 -1.7726 -0.3048
good – very_bad -0.8974 0.40752 -1.6968 -0.0980 -2.202 0.028 -0.8155 -1.5420 -0.0890
neutral – very_bad -0.5974 0.40695 -1.3956 0.2009 -1.468 0.142 -0.5429 -1.2683 0.1826
bad – very_bad -0.1025 0.41099 -0.9087 0.7037 -0.249 0.803 -0.0932 -0.8258 0.6395
q49_health_limitations:
partially_limits – limits -0.1526 0.14589 -0.4388 0.1336 -1.046 0.296 -0.1387 -0.3987 0.1214
no_limits – limits -0.3400 0.15004 -0.6343 -0.0457 -2.266 0.024 -0.3090 -0.5765 -0.0415
q03_relationship_type:
relationship – single -0.2024 0.08245 -0.3641 -0.0407 -2.455 0.014 -0.1839 -0.3309 -0.0369
married – single -0.1680 0.09757 -0.3593 0.0234 -1.721 0.085 -0.1526 -0.3265 0.0213
divorced – single 0.0380 0.12281 -0.2029 0.2789 0.310 0.757 0.0346 -0.1844 0.2535
widowed – single 0.0302 0.13777 -0.2401 0.3004 0.219 0.827 0.0274 -0.2182 0.2730
q20_public_info:
yes – no -0.2055 0.07336 -0.3495 -0.0616 -2.802 0.005 -0.1868 -0.3176 -0.0560
do_not_know – no -0.2222 0.10208 -0.4224 -0.0219 -2.176 0.030 -0.2019 -0.3839 -0.0199
q34_02_face_mask:
yes – no 0.2378 0.15085 -0.0581 0.5337 1.577 0.115 0.2161 -0.0528 0.4850
q38_alcohol:
no – yes -0.1373 0.05386 -0.2429 -0.0316 -2.549 0.011 -0.1248 -0.2208 -0.0287
q40_smoking:
no – yes -0.1000 0.07475 -0.2466 0.0467 -1.337 0.181 -0.0909 -0.2241 0.0424
q48_chronic_illness:
no – yes -0.1375 0.06142 -0.2579 -0.0170 -2.238 0.025 -0.1249 -0.2344 -0.0154
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
ᵃ Represents reference level
ASSUMPTION CHECKS
Durbin–Watson Test for Autocorrelation
────────────────────────────────────────────
Autocorrelation DW Statistic p
────────────────────────────────────────────
0.0372 1.92 0.120
────────────────────────────────────────────
Collinearity Statistics
──────────────────────────────────────────────────
VIF Tolerance
──────────────────────────────────────────────────
q01_gender 1.04 0.960
q02_age 1.64 0.609
q04_children 1.60 0.624
q36_econ_worry 1.01 0.987
q18_02_soc_media 1.08 0.925
q47_self_reporting_health 1.09 0.920
q49_health_limitations 1.18 0.848
q03_relationship_type 1.17 0.856
q20_public_info 1.02 0.979
q34_02_face_mask 1.01 0.995
q38_alcohol 1.05 0.956
q40_smoking 1.02 0.980
q48_chronic_illness 1.21 0.828
──────────────────────────────────────────────────
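The confidence limits in the coefficient tables follow from the estimate, its standard error and the residual degrees of freedom: estimate ± t(0.975, df) × SE. As a check, the sketch below recomputes the interval for the male – female contrast of model 1 (estimate -0.3026, SE 0.05641, 1449 residual degrees of freedom), assuming the default 95% confidence level; results agree with the table only up to rounding of the printed inputs.

```r
# Values copied from the model 1 coefficient table (male - female)
estimate <- -0.3026
se       <- 0.05641
df_resid <- 1449

# Two-sided 95% confidence interval based on the t distribution
t_crit <- qt(0.975, df_resid)
ci <- estimate + c(-1, 1) * t_crit * se

print(round(ci, 4))
```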
Aside from the regression models, we intend to explore the mediating role of concern/anxiety between social media consumption and depression through a mediation/moderation analysis (in section 5).
Unlike depression, which is measured with the PHQ-8 index, Covid-19 concern or anxiety has no standardized measure in this survey. We therefore proceed inductively, using Covid-19-related survey items that could represent the underlying construct.
In this section, we aim to construct a Covid-19 concern index from several survey items using factor analysis. As a first step, we select the survey items that should be manifestations of the latent factor of Covid-19-related concern/anxiety.
These survey items are:
| Survey question (1-10 scale) | Original variable | Renamed variable name |
|---|---|---|
| How scared are you of the risk of getting sick? | q30 | q30_concern_infection_covid |
| How scared are you of the risk that someone in your family or network of friends will get COVID-19? | q31 | q31_concern_infection_friends |
| I feel very anxious about the health emergency. | q33_01 | q33_01_concern_situation |
| I think I have little control over whether I get the infection. | q33_02 | q33_02_concern_low_control |
| I am scared that I will not be able to survive if I get sick due to COVID-19 or I got sick and I was scared that I would not survive. | q33_03 | q33_03_concern_survival_covid |
| I thought about quitting my job / dropping out of school due to COVID-19. | q33_04 | q33_04_concern_change_employment |
| I am afraid of transmitting the coronavirus to others. | q33_05 | q33_05_concern_infecting_others |
After the initial selection, we examine these survey items with a set of descriptive statistics. Following established practice for factor analysis, we also split the sample into two randomly chosen halves (Cabrera-Nguyen 2010). The first half of the data set is used for the Exploratory Factor Analysis, while the second half is used for the Reliability and Confirmatory Factor Analyses (all functions from the jmv package).
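The logic of such a split can be sketched in base R on a toy data frame. This sketch assumes an index-based split, which is a different technique from the slice_sample() and setdiff() calls used in this document, and the toy data and object names are hypothetical.

```r
# Illustrative only: toy data with a unique id per row, not the survey data.
set.seed(2021)
toy <- data.frame(id = 1:10, item = sample(1:10, 10, replace = TRUE))

# Draw half of the row indices at random for the "train" half
train_idx <- sample(nrow(toy), size = nrow(toy) / 2)

train_toy <- toy[train_idx, ]   # used for exploratory analysis
test_toy  <- toy[-train_idx, ]  # held out for confirmatory analysis

# The two halves are disjoint and together cover the full data set
print(sort(train_toy$id))
print(sort(test_toy$id))
```

Splitting by row index keeps the halves disjoint even when two respondents give identical answers, whereas a set difference on whole rows relies on every row being unique (here guaranteed by the id column).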
anx_items_descriptives <- jmv::descriptives(
  data = data,
  vars = vars("q30_concern_infection_covid",
              "q31_concern_infection_friends",
              "q33_01_concern_situation",
              "q33_02_concern_low_control",
              "q33_03_concern_survival_covid",
              "q33_04_concern_change_employment",
              "q33_05_concern_infecting_others"),
  hist = TRUE,
  min = FALSE,
  max = FALSE)

# We also split the sample into two halves: the "train" half, on which we
# conduct the EFA, and the "test" half, on which we test our construct
# through the reliability analysis and the CFA.
set.seed(2021)
train_set <- data %>% slice_sample(n = 742)
test_set <- setdiff(data, train_set)
anx_items_descriptives$plots
anx_items_descriptives$descriptives
Descriptives
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
q30_concern_infection_covid q31_concern_infection_friends q33_01_concern_situation q33_02_concern_low_control q33_03_concern_survival_covid q33_04_concern_change_employment q33_05_concern_infecting_others
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
N 1482 1482 1484 1484 1484 1484 1484
Missing 2 2 0 0 0 0 0
Mean 4.30 5.78 5.70 4.36 3.24 1.82 5.76
Median 4.00 6.00 5.00 4.00 2.00 1.00 6.00
Standard deviation 2.36 2.61 2.76 2.56 2.70 2.09 3.17
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
In the next step, we conduct an Exploratory Factor Analysis on these variables.
In line with best practices, we run the assumption checks (the KMO measure of sampling adequacy and Bartlett’s test of sphericity), retain factors with an eigenvalue greater than 1, and hide factor loadings below 0.4.
The result is a one-factor construct that includes all of the variables except q33_04_concern_change_employment, whose loading falls below 0.4 and which therefore does not seem to be a good manifestation of Covid-19 concern within this group of variables. We exclude this variable in the next step.
# To conduct the EFA, we use the efa() function from the jmv package on
# the "train" data set (as opposed to the "test" data set used for the CFA).
jmv::efa(
  data = train_set,
  vars = vars("q30_concern_infection_covid",
              "q31_concern_infection_friends",
              "q33_01_concern_situation",
              "q33_02_concern_low_control",
              "q33_03_concern_survival_covid",
              "q33_04_concern_change_employment",
              "q33_05_concern_infecting_others"),
  nFactorMethod = "eigen",
  nFactors = 1,
  minEigen = 1,
  rotation = "promax",
  hideLoadings = 0.4,
  screePlot = TRUE,
  factorSummary = TRUE,
  kmo = TRUE,
  bartlett = TRUE)
EXPLORATORY FACTOR ANALYSIS
Factor Loadings
───────────────────────────────────────────────────────────
1 Uniqueness
───────────────────────────────────────────────────────────
q30_concern_infection_covid 0.861 0.258
q31_concern_infection_friends 0.813 0.340
q33_01_concern_situation 0.577 0.667
q33_02_concern_low_control 0.443 0.804
q33_03_concern_survival_covid 0.454 0.794
q33_04_concern_change_employment 0.953
q33_05_concern_infecting_others 0.533 0.716
───────────────────────────────────────────────────────────
Note. 'Minimum residual' extraction method was used
in combination with a 'promax' rotation
FACTOR STATISTICS
Summary
──────────────────────────────────────────────────────────
Factor SS Loadings % of Variance Cumulative %
──────────────────────────────────────────────────────────
1 2.47 35.3 35.3
──────────────────────────────────────────────────────────
ASSUMPTION CHECKS
Bartlett's Test of Sphericity
─────────────────────────────
χ² df p
─────────────────────────────
1376 21 < .001
─────────────────────────────
KMO Measure of Sampling Adequacy
─────────────────────────────────────────────
MSA
─────────────────────────────────────────────
Overall 0.779
q30_concern_infection_covid 0.743
q31_concern_infection_friends 0.719
q33_01_concern_situation 0.883
q33_02_concern_low_control 0.849
q33_03_concern_survival_covid 0.823
q33_04_concern_change_employment 0.730
q33_05_concern_infecting_others 0.812
─────────────────────────────────────────────
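Bartlett's test of sphericity asks whether the item correlation matrix R differs from an identity matrix. Its statistic is χ² = −(n − 1 − (2p + 5) / 6) · ln|R| with p(p − 1) / 2 degrees of freedom, which gives the 21 degrees of freedom reported above for the seven items. The sketch below computes the test by hand on simulated items; the simulated data, sample size and loading of 0.6 are assumptions for illustration and do not reproduce the survey result.

```r
# Illustrative only: seven simulated items driven by one latent factor.
set.seed(1)
n <- 200
p <- 7
latent <- rnorm(n)
items <- sapply(seq_len(p), function(j) 0.6 * latent + rnorm(n))

# Bartlett's test of sphericity on the item correlation matrix
R <- cor(items)
chi2 <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
df <- p * (p - 1) / 2
p_value <- pchisq(chi2, df, lower.tail = FALSE)

print(c(chi2 = chi2, df = df))
print(p_value)
```

A significant result, as in the output above, indicates that the items are correlated enough for a factor analysis to be meaningful.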
Secondly, we conduct a Reliability Analysis of the Covid-19 concern factor on the test set. We use a cutoff value of 0.7 for both McDonald’s omega and Cronbach’s alpha. The scale passes this cutoff (α = 0.784, ω = 0.797), and neither statistic would improve if any of the items were dropped.
# To conduct the reliability analysis, we use the reliability() function from the
# jmv package on the "test" data set (as opposed to the "train" data set used for the EFA).
jmv::reliability(
  data = test_set,
  vars = vars("q30_concern_infection_covid",
              "q31_concern_infection_friends",
              "q33_01_concern_situation",
              "q33_02_concern_low_control",
              "q33_03_concern_survival_covid",
              "q33_05_concern_infecting_others"),
  omegaScale = TRUE,
  alphaItems = TRUE,
  omegaItems = TRUE)
RELIABILITY ANALYSIS
Scale Reliability Statistics
─────────────────────────────────────────
Cronbach's α McDonald's ω
─────────────────────────────────────────
scale 0.784 0.797
─────────────────────────────────────────
Item Reliability Statistics
───────────────────────────────────────────────────────────────────
Cronbach's α McDonald's ω
───────────────────────────────────────────────────────────────────
q30_concern_infection_covid 0.719 0.726
q31_concern_infection_friends 0.724 0.738
q33_01_concern_situation 0.752 0.772
q33_02_concern_low_control 0.761 0.782
q33_03_concern_survival_covid 0.775 0.790
q33_05_concern_infecting_others 0.773 0.784
───────────────────────────────────────────────────────────────────
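Cronbach's alpha can be computed directly from the item variances: α = k / (k − 1) · (1 − Σ var(itemᵢ) / var(total score)), and the "if item dropped" values are obtained by recomputing alpha on the remaining items. The sketch below does this on simulated items; the data and the helper cronbach_alpha() are hypothetical and serve only to show the arithmetic behind the tables above.

```r
# Illustrative only: six simulated items driven by one latent factor.
set.seed(1)
n <- 200
k <- 6
latent <- rnorm(n)
items <- sapply(seq_len(k), function(j) 0.6 * latent + rnorm(n))

# Hypothetical helper: Cronbach's alpha for a matrix of item scores
cronbach_alpha <- function(m) {
  n_items <- ncol(m)
  item_vars <- apply(m, 2, var)
  total_var <- var(rowSums(m))
  (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)
}

alpha <- cronbach_alpha(items)

# Alpha recomputed with each item left out in turn
alpha_if_dropped <- sapply(seq_len(k), function(j) {
  cronbach_alpha(items[, -j, drop = FALSE])
})

print(round(alpha, 3))
print(round(alpha_if_dropped, 3))
```

An item is a candidate for removal only if dropping it raises alpha above the value for the full scale.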
Thirdly, we conduct a Confirmatory Factor Analysis of the one-factor model on the test set. Judged by commonly used cut-offs for CFA fit, the Standardized Root Mean Square Residual of 0.0521 (cut-off SRMR < 0.08) indicates a good fit. However, the Root Mean Square Error of Approximation is 0.150 (90% CI 0.130-0.171; cut-off RMSEA < 0.08), the Comparative Fit Index is 0.887 (cut-off CFI ≥ 0.90), and the chi-square test of exact fit is significant (χ² = 159, df = 9, p < 0.001); none of these indicates a good fit.
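The RMSEA point estimate of 0.150 reported in the Fit Measures output can be approximately recovered from the chi-square statistic with RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))). The sketch below assumes N = 742, i.e. half of the 1484 respondents; the sample actually used in the CFA may be slightly smaller because of missing values, and some software divides by N rather than N − 1, so the result is a rough check only.

```r
# Values copied from the Test for Exact Fit table; N is an assumption
chi2 <- 159
df   <- 9
n    <- 742  # assumed size of the test half

# RMSEA from the chi-square statistic (N - 1 variant)
rmsea <- sqrt(max(chi2 - df, 0) / (df * (n - 1)))

print(round(rmsea, 3))
```

Because χ² is about 18 times its degrees of freedom here, the RMSEA stays well above the 0.08 cut-off regardless of which variant of the formula is used.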
# To conduct the CFA, we use the cfa() function from the jmv package on the "test"
# data set (as opposed to the "train" data set used for the EFA).
jmv::cfa(
  data = test_set,
  factors = list(
    list(
      label = "Concern",
      vars = c(
        "q30_concern_infection_covid",
        "q31_concern_infection_friends",
        "q33_01_concern_situation",
        "q33_02_concern_low_control",
        "q33_03_concern_survival_covid",
        "q33_05_concern_infecting_others"))),
  resCov = list(),
  ci = TRUE,
  stdEst = TRUE,
  factCovEst = FALSE,
  fitMeasures = c("cfi", "tli", "rmsea", "srmr"),
  corRes = TRUE)
CONFIRMATORY FACTOR ANALYSIS
Factor Loadings
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Factor Indicator Estimate SE Lower Upper Z p Stand. Estimate
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Concern q30_concern_infection_covid 1.99 0.0787 1.84 2.15 25.3 < .001 0.834
q31_concern_infection_friends 2.08 0.0869 1.91 2.25 23.9 < .001 0.800
q33_01_concern_situation 1.57 0.1015 1.37 1.77 15.4 < .001 0.566
q33_02_concern_low_control 1.27 0.0962 1.08 1.45 13.1 < .001 0.495
q33_03_concern_survival_covid 1.31 0.1037 1.11 1.51 12.6 < .001 0.480
q33_05_concern_infecting_others 1.70 0.1198 1.46 1.93 14.2 < .001 0.534
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
MODEL FIT
Test for Exact Fit
───────────────────────
χ² df p
───────────────────────
159 9 < .001
───────────────────────
Fit Measures
───────────────────────────────────────────────────────
CFI TLI SRMR RMSEA Lower Upper
───────────────────────────────────────────────────────
0.887 0.812 0.0521 0.150 0.130 0.171
───────────────────────────────────────────────────────
POST-HOC MODEL PERFORMANCE
Residuals for Observed Correlation Matrix
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
q30_concern_infection_covid q31_concern_infection_friends q33_01_concern_situation q33_02_concern_low_control q33_03_concern_survival_covid q33_05_concern_infecting_others
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
q30_concern_infection_covid 0.021 -0.028 -0.005 0.057 -0.075
q31_concern_infection_friends -0.016 -0.084 -0.099 0.098
q33_01_concern_situation 0.128 0.035 0.009
q33_02_concern_low_control 0.120 0.026
q33_03_concern_survival_covid -0.060
q33_05_concern_infecting_others
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
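The comparison of fit measures against their cut-offs can be made explicit in a small table. The sketch below hard-codes the values printed in the Fit Measures output and the cut-offs quoted in the text; the object name `fit_check` and its column names are illustrative assumptions.

```r
# Fit measures copied from the CFA output, with the cut-offs quoted in the text
fit_check <- data.frame(
  measure = c("SRMR", "RMSEA", "CFI"),
  value = c(0.0521, 0.150, 0.887),
  cutoff = c(0.08, 0.08, 0.90),
  higher_is_better = c(FALSE, FALSE, TRUE)
)

# SRMR and RMSEA should fall below their cut-off, CFI should reach it
fit_check$acceptable <- ifelse(fit_check$higher_is_better,
                               fit_check$value >= fit_check$cutoff,
                               fit_check$value < fit_check$cutoff)

fit_check[, c("measure", "value", "cutoff", "acceptable")]
```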
After the reliability analysis and CFA, we combine the six items into a single variable named concern_index, computed as the row-wise mean of the items. We also produce a plot and descriptive statistics for the new concern_index variable.
# Creating the Covid-19-related concern/anxiety index, consisting of the average of
# the values of the multiple variables selected through factor analysis to
# represent the underlying construct.
concern_index <- apply(cbind(data$q30_concern_infection_covid,
data$q31_concern_infection_friends,
data$q33_01_concern_situation,
data$q33_02_concern_low_control,
data$q33_03_concern_survival_covid,
data$q33_05_concern_infecting_others), 1, mean)
# Adding the vector as a column to the existing datasets.
data <- cbind(data, concern_index)
data_corr <- cbind(data_corr, concern_index)
#To summarize the concern_index variable, we use the descriptives()
# function from the jmv package.
anx_index_descriptives <- jmv::descriptives(
data = data,
missing = TRUE,
vars = "concern_index",
sd = TRUE,
median = FALSE,
pc = TRUE,
range = TRUE,
box = TRUE)
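The row-wise mean used for concern_index propagates missing values: a respondent with any missing item gets a missing index. The sketch below illustrates this on a toy matrix (the item names and values are made up) and shows that `rowMeans()` gives the same result as the `apply(..., 1, mean)` idiom. Whether to average over the available items instead (`na.rm = TRUE`) is a design choice that the original code does not make.

```r
# Toy item matrix; the third respondent has one missing item (illustrative values)
toy_items <- cbind(item_a = c(2, 5, NA),
                   item_b = c(4, 7, 6),
                   item_c = c(6, 9, 3))

# Same idiom as in the analysis: row-wise mean, NA propagates
index_apply <- apply(toy_items, 1, mean)

# Equivalent vectorised form
index_rowmeans <- rowMeans(toy_items)

# Alternative (not used in the analysis): mean of the available items only
index_available <- rowMeans(toy_items, na.rm = TRUE)
```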
# Function to get the result from the correlation matrix into a data frame
flattenCorrMatrix <- function(cormat, pmat) {
ut <- upper.tri(cormat)
data.frame(
row = rownames(cormat)[row(cormat)[ut]],
column = rownames(cormat)[col(cormat)[ut]],
cor = (cormat)[ut],
p = pmat[ut]
)
}
#Correlation matrix using Spearman coefficient
corr_mtx <- rcorr(as.matrix(data_corr), type = "spearman")
# Selecting only significant correlates for PHQ8 (values with p>0.05 are excluded)
flattenCorrMatrix(corr_mtx$r, corr_mtx$P) %>% filter(p <= 0.05,
column %in% c("PHQ8_t")) %>%
arrange(desc(abs(cor)))
row column cor p
1 q02_age PHQ8_t -0.3113 0.000000000000000000
2 q04_children PHQ8_t 0.2593 0.000000000000000000
3 q03_relationship_type PHQ8_t -0.2092 0.000000000000000444
4 q18_02_soc_media PHQ8_t 0.1887 0.000000000000232259
5 q47_self_reporting_health PHQ8_t 0.1845 0.000000000001153300
6 q36_econ_worry PHQ8_t -0.1502 0.000000006065404623
7 q49_health_limitations PHQ8_t -0.1448 0.000000027075539144
8 q11_education PHQ8_t -0.1250 0.000001353406583249
9 q20_public_info PHQ8_t 0.1120 0.000015253877716948
10 q35_01_contact_close_family PHQ8_t 0.1024 0.000077824971264739
11 q48_chronic_illness PHQ8_t -0.0835 0.001404622458688554
12 q01_gender PHQ8_t -0.0814 0.001693246620673605
13 q40_smoking PHQ8_t -0.0677 0.009065341895034829
# Selecting only significant correlates for concern index (values with p>0.05 are excluded)
flattenCorrMatrix(corr_mtx$r, corr_mtx$P) %>% filter(p <= 0.05,
column %in% c("concern_index")) %>%
arrange(desc(abs(cor)))
row column cor p
1 PHQ8_t concern_index 0.2665 0.0000000000000000
2 q47_self_reporting_health concern_index 0.2550 0.0000000000000000
3 q48_chronic_illness concern_index -0.1933 0.0000000000000946
4 q49_health_limitations concern_index -0.1662 0.0000000001683467
5 q34_07_hand_washing concern_index -0.1169 0.0000064627762864
6 q36_econ_worry concern_index -0.1026 0.0000758838946182
7 q42_sport concern_index 0.0805 0.0019215744673602
8 q35_01_contact_close_family concern_index -0.0799 0.0020720004591863
9 q20_public_info concern_index 0.0738 0.0044853171363255
10 q35_03_contact_friends concern_index -0.0698 0.0071717066761323
11 q01_gender concern_index -0.0587 0.0238620102866376
12 q34_02_face_mask concern_index -0.0578 0.0259869544484910
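To show what the flattening helper returns, here is a small self-contained sketch on simulated data. It repeats the `flattenCorrMatrix()` definition from above and, as an assumption made only to avoid extra dependencies, builds the correlation and p-value matrices with base R (`cor()` and `cor.test()`) rather than `rcorr()`.

```r
# Helper repeated from the analysis: upper triangle of a correlation matrix as rows
flattenCorrMatrix <- function(cormat, pmat) {
  ut <- upper.tri(cormat)
  data.frame(
    row = rownames(cormat)[row(cormat)[ut]],
    column = rownames(cormat)[col(cormat)[ut]],
    cor = (cormat)[ut],
    p = pmat[ut]
  )
}

# Simulated variables; `c` is constructed to correlate with `a`
set.seed(1)
toy <- data.frame(a = rnorm(50), b = rnorm(50))
toy$c <- toy$a + rnorm(50, sd = 0.5)

# Spearman correlations and pairwise p-values in matrix form
r_mat <- cor(toy, method = "spearman")
p_mat <- matrix(NA_real_, 3, 3, dimnames = dimnames(r_mat))
for (i in 1:3) for (j in 1:3) if (i != j) {
  p_mat[i, j] <- cor.test(toy[[i]], toy[[j]],
                          method = "spearman", exact = FALSE)$p.value
}

flat <- flattenCorrMatrix(r_mat, p_mat)

# Keep only pairs with p <= 0.05, as in the analysis
flat[flat$p <= 0.05, ]
```

Because only the upper triangle is kept, each pair appears once, with the variable that comes first in the data in `row` and the later one in `column`.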
anx_index_descriptives$plots
anx_index_descriptives$descriptives
Descriptives
───────────────────────────────────────
concern_index
───────────────────────────────────────
N 1482
Missing 2
Mean 4.86
Standard deviation 1.86
Range 9.00
Minimum 1.00
Maximum 10.0
25th percentile 3.50
50th percentile 4.83
75th percentile 6.17
───────────────────────────────────────
To explore our hypothesized pathway (see H13) between social media exposure and depression, partially mediated by Covid-19-related concern and moderated by age and gender (age is presumed to influence both the social media exposure and the depression pathway), we conduct a moderated mediation analysis using the lavaan package, conceptually structured as Hayes' model 76.
# Before running the model, we need to transform the social media string
# dummy (yes/no) back to its numeric form, with similar operation for gender.
levels(data$q18_02_soc_media) <- list("1" = "yes", "0" = "no")
levels(data$q01_gender) <- list("0" = "female", "1" = "male")
data$q01_gender <- as.numeric(as.character(data$q01_gender))
data$q18_02_soc_media <- as.numeric(as.character(data$q18_02_soc_media))
# Centering continuous variables with scaling
data_sem <- data %>%
filter(!is.na(concern_index)) %>%
mutate(concern_index.c = scale(concern_index, scale = TRUE),
PHQ8.c = scale(PHQ8_t, scale = TRUE),
q02_age.c = scale(q02_age, scale = TRUE))
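The two preparation steps above (recoding a yes/no factor to 0/1 and standardizing a continuous variable) can be checked on toy vectors. The values below are made up for illustration; wrapping `scale()` in `as.numeric()` is an assumption added here to get a plain vector rather than a one-column matrix.

```r
# Recode a yes/no factor to numeric 1/0, using the same idiom as the analysis
soc_media <- factor(c("yes", "no", "yes", "no"))
levels(soc_media) <- list("1" = "yes", "0" = "no")
soc_media_num <- as.numeric(as.character(soc_media))

# Standardize a continuous variable: subtract the mean, divide by the SD
age <- c(20, 30, 40, 50)
age_z <- as.numeric(scale(age, scale = TRUE))

soc_media_num
round(age_z, 3)
```

Going through `as.character()` before `as.numeric()` matters: calling `as.numeric()` directly on a factor returns the internal level codes rather than the labels.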
# Labels for diagrams
labels_H76 <- list(X = "Social Media",
M = "Concern",
Y = "Depression",
W = "Age",
Z = "Gender")
pmacroModel(76,
labels = labels_H76,
xmargin = 0,
rady = 0.047,
radx = 0.09,
ylim = c(0.15, 0.8))
statisticalDiagram(76,
labels = labels_H76,
whatLabel = "name",
xmargin = 0.01,
rady = 0.03,
radx = 0.11,
ylim = c(0.06, 0.95),
xlim = c(0.01, 1))
In the second step, we specify the key pathways and run the analysis, while bootstrapping the confidence intervals.
# Mediation-moderation analysis (path analysis framework, SEM) using lavaan package.
# First, we specify the model pathways
spec_mod <- "
# Regressions
concern_index.c ~ a1*q18_02_soc_media + a2*q02_age.c + a3*q01_gender + a4*q18_02_soc_media:q02_age.c + a5*q18_02_soc_media:q01_gender
PHQ8.c ~ c1*q18_02_soc_media + c2*q02_age.c + c3*q01_gender + c4*q18_02_soc_media:q02_age.c + c5*q18_02_soc_media:q01_gender + b1*concern_index.c + b2*concern_index.c:q02_age.c + b3*concern_index.c:q01_gender
#Mean and variance of age and gender moderators
q02_age.c ~ q02_age.c.mean*1
q02_age.c ~~ q02_age.c.var*q02_age.c
q01_gender ~ q01_gender.mean*1
q01_gender ~~ q01_gender.var*q01_gender
# Effect specifications
XonM := a1 + a4*q02_age.c.mean + a5*q01_gender.mean
MonY := b1 + b2*q02_age.c.mean + b3*q01_gender.mean
indirect := (a1 + a4*q02_age.c.mean + a5*q01_gender.mean)*(b1 + b2*q02_age.c.mean + b3*q01_gender.mean)
direct := c1 + c4*q02_age.c.mean + c5*q01_gender.mean
total := direct + indirect
prop.mediated := indirect / total
# Component effects conditional on moderators (X = Social Media, M = Concern, Y = Depression, W = Age, Z = Gender)
XonM.mean.male := a1 + a4*q02_age.c.mean + a5*1
XonM.mean.female := a1 + a4*q02_age.c.mean + a5*0
XonM.blw.male := a1 + a4*(q02_age.c.mean - sqrt(q02_age.c.var)) + a5*1
XonM.blw.female := a1 + a4*(q02_age.c.mean - sqrt(q02_age.c.var)) + a5*0
XonM.blw.avg := a1 + a4*(q02_age.c.mean - sqrt(q02_age.c.var)) + a5*q01_gender.mean
XonM.abv.male := a1 + a4*(q02_age.c.mean + sqrt(q02_age.c.var)) + a5*1
XonM.abv.female := a1 + a4*(q02_age.c.mean + sqrt(q02_age.c.var)) + a5*0
XonM.abv.avg := a1 + a4*(q02_age.c.mean + sqrt(q02_age.c.var)) + a5*q01_gender.mean
MonY.mean.male := b1 + b2*q02_age.c.mean + b3*1
MonY.mean.female := b1 + b2*q02_age.c.mean + b3*0
MonY.blw.male := b1 + b2*(q02_age.c.mean - sqrt(q02_age.c.var)) + b3*1
MonY.blw.female := b1 + b2*(q02_age.c.mean - sqrt(q02_age.c.var)) + b3*0
MonY.blw.avg := b1 + b2*(q02_age.c.mean - sqrt(q02_age.c.var)) + b3*q01_gender.mean
MonY.abv.male := b1 + b2*(q02_age.c.mean + sqrt(q02_age.c.var)) + b3*1
MonY.abv.female := b1 + b2*(q02_age.c.mean + sqrt(q02_age.c.var)) + b3*0
MonY.abv.avg := b1 + b2*(q02_age.c.mean + sqrt(q02_age.c.var)) + b3*q01_gender.mean
# Indirect effects conditional on moderators
indirect.mean.male := (a1 + a4*q02_age.c.mean + a5*1)*(b1 + b2*q02_age.c.mean + b3*1)
indirect.mean.female := (a1 + a4*q02_age.c.mean + a5*0)*(b1 + b2*q02_age.c.mean + b3*0)
indirect.blw.male := (a1 + a4*(q02_age.c.mean - sqrt(q02_age.c.var)) + a5*1)*(b1 + b2*(q02_age.c.mean - sqrt(q02_age.c.var)) + b3*1)
indirect.blw.female := (a1 + a4*(q02_age.c.mean - sqrt(q02_age.c.var)) + a5*0)*(b1 + b2*(q02_age.c.mean - sqrt(q02_age.c.var)) + b3*0)
indirect.blw.avg := (a1 + a4*(q02_age.c.mean - sqrt(q02_age.c.var)) + a5*q01_gender.mean)*(b1 + b2*(q02_age.c.mean - sqrt(q02_age.c.var)) + b3*q01_gender.mean)
indirect.abv.male := (a1 + a4*(q02_age.c.mean + sqrt(q02_age.c.var)) + a5*1)*(b1 + b2*(q02_age.c.mean + sqrt(q02_age.c.var)) + b3*1)
indirect.abv.female := (a1 + a4*(q02_age.c.mean + sqrt(q02_age.c.var)) + a5*0)*(b1 + b2*(q02_age.c.mean + sqrt(q02_age.c.var)) + b3*0)
indirect.abv.avg := (a1 + a4*(q02_age.c.mean + sqrt(q02_age.c.var)) + a5*q01_gender.mean)*(b1 + b2*(q02_age.c.mean + sqrt(q02_age.c.var)) + b3*q01_gender.mean)
# Direct effects conditional on moderators
direct.mean.male := c1 + c4*q02_age.c.mean + c5*1
direct.mean.female := c1 + c4*q02_age.c.mean + c5*0
direct.blw.male := c1 + c4*(q02_age.c.mean - sqrt(q02_age.c.var)) + c5*1
direct.blw.female := c1 + c4*(q02_age.c.mean - sqrt(q02_age.c.var)) + c5*0
direct.blw.avg := c1 + c4*(q02_age.c.mean - sqrt(q02_age.c.var)) + c5*q01_gender.mean
direct.abv.male := c1 + c4*(q02_age.c.mean + sqrt(q02_age.c.var)) + c5*1
direct.abv.female := c1 + c4*(q02_age.c.mean + sqrt(q02_age.c.var)) + c5*0
direct.abv.avg := c1 + c4*(q02_age.c.mean + sqrt(q02_age.c.var)) + c5*q01_gender.mean
# Total effects conditional on moderators
total.mean.male := direct.mean.male + indirect.mean.male
total.mean.female := direct.mean.female + indirect.mean.female
total.blw.male := direct.blw.male + indirect.blw.male
total.blw.female := direct.blw.female + indirect.blw.female
total.blw.avg := direct.blw.avg + indirect.blw.avg
total.abv.male := direct.abv.male + indirect.abv.male
total.abv.female := direct.abv.female + indirect.abv.female
total.abv.avg := direct.abv.avg + indirect.abv.avg
# Proportion mediated conditional on moderators
prop.med.mean.male := indirect.mean.male / total.mean.male
prop.med.mean.female := indirect.mean.female / total.mean.female
prop.med.blw.male := indirect.blw.male / total.blw.male
prop.med.blw.female := indirect.blw.female / total.blw.female
prop.med.blw.avg := indirect.blw.avg / total.blw.avg
prop.med.abv.male := indirect.abv.male / total.abv.male
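# NB: the prop.med.abv.female estimates printed in the output were produced with total.abv.male in this denominator
prop.med.abv.female := indirect.abv.female / total.abv.female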
prop.med.abv.avg := indirect.abv.avg / total.abv.avg"
# For reproducibility of results (using bootstrap)
set.seed(2021)
# Secondly, we fit/estimate the model and we use bootstrap for robustness.
fit_mod <- lavaan::sem(model = spec_mod,
data = data_sem,
se = "bootstrap",
bootstrap = 1000)
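The idea behind bootstrapping an indirect effect can be illustrated without lavaan. The following is a minimal sketch on simulated data, using two ordinary regressions and a percentile interval. The data, effect sizes, and the names `toy`, `indirect_effect`, and `n_boot` are assumptions for illustration; the interval type that lavaan reports depends on its own settings.

```r
# Simulated data: binary X, mediator M, outcome Y (illustrative effect sizes)
set.seed(2021)
n <- 500
x <- rbinom(n, 1, 0.5)
m <- 0.5 * x + rnorm(n)
y <- 0.5 * m + 0.2 * x + rnorm(n)
toy <- data.frame(x, m, y)

# Indirect effect = (X -> M path) * (M -> Y path, controlling for X)
indirect_effect <- function(d) {
  a <- coef(lm(m ~ x, data = d))[["x"]]
  b <- coef(lm(y ~ m + x, data = d))[["m"]]
  a * b
}

# Resample rows with replacement and re-estimate the product each time
n_boot <- 500
boot_ab <- replicate(n_boot,
                     indirect_effect(toy[sample(n, replace = TRUE), ]))

point_ab <- indirect_effect(toy)
ci_ab <- quantile(boot_ab, c(0.025, 0.975))  # percentile bootstrap interval
```

The product of two coefficients is generally not normally distributed, which is the usual motivation for resampling instead of relying on a normal-theory standard error.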
# Labels for statistical diagrams
labels_stats_H76 <- list(X = "q18_02_soc_media",
M = "concern_index.c",
Y = "PHQ8.c",
W = "q02_age.c",
Z = "q01_gender")
statisticalDiagram(76,
labels = labels_stats_H76,
fit = fit_mod,
whatLabel = "est",
xmargin = 0.01,
rady = 0.03,
radx = 0.158,
ylim = c(0.06, 0.95),
xlim = c(0.01, 1))
statisticalDiagram(76,
labels = labels_stats_H76,
fit = fit_mod,
whatLabel = "std",
xmargin = 0.01,
rady = 0.03,
radx = 0.158,
ylim = c(0.06, 0.95),
xlim = c(0.01, 1))
lavaan::summary(fit_mod,
rsquare = TRUE,
ci = TRUE,
fit.measures = TRUE,
standardized = TRUE)
lavaan 0.6-9 ended normally after 40 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 21
Number of observations 1482
Model Test User Model:
Test statistic 1543.409
Degrees of freedom 13
P-value (Chi-square) 0.000
Model Test Baseline Model:
Test statistic 1866.058
Degrees of freedom 26
P-value 0.000
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.168
Tucker-Lewis Index (TLI) -0.663
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -7077.817
Loglikelihood unrestricted model (H1) -6306.112
Akaike (AIC) 14197.634
Bayesian (BIC) 14308.958
Sample-size adjusted Bayesian (BIC) 14242.247
Root Mean Square Error of Approximation:
RMSEA 0.282
90 Percent confidence interval - lower 0.270
90 Percent confidence interval - upper 0.294
P-value RMSEA <= 0.05 0.000
Standardized Root Mean Square Residual:
SRMR 0.139
Parameter Estimates:
Standard errors Bootstrap
Number of requested bootstrap draws 1000
Number of successful bootstrap draws 1000
Regressions:
Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
concern_index.c ~
q18_02_s_ (a1) -0.029 0.087 -0.333 0.739 -0.198 0.153
q02_age.c (a2) 0.004 0.030 0.131 0.896 -0.053 0.066
q01_gendr (a3) -0.171 0.067 -2.567 0.010 -0.307 -0.040
q18_02__: (a4) -0.110 0.078 -1.408 0.159 -0.264 0.047
q18_02__: (a5) 0.146 0.125 1.169 0.243 -0.102 0.388
PHQ8.c ~
q18_02_s_ (c1) 0.083 0.068 1.218 0.223 -0.050 0.218
q02_age.c (c2) -0.260 0.028 -9.443 0.000 -0.313 -0.205
q01_gendr (c3) -0.244 0.063 -3.870 0.000 -0.376 -0.120
q18_02__: (c4) -0.153 0.062 -2.473 0.013 -0.272 -0.029
q18_02__: (c5) 0.072 0.120 0.604 0.546 -0.158 0.316
cncrn_nd. (b1) 0.242 0.029 8.219 0.000 0.183 0.299
cn_.:02_. (b2) 0.006 0.027 0.218 0.828 -0.045 0.058
cnc_.:01_ (b3) 0.081 0.057 1.422 0.155 -0.038 0.192
Std.lv Std.all
-0.029 -0.013
0.004 0.004
-0.171 -0.078
-0.110 -0.052
0.146 0.040
0.083 0.038
-0.260 -0.264
-0.244 -0.112
-0.153 -0.073
0.072 0.020
0.242 0.246
0.006 0.006
0.081 0.043
Intercepts:
Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
q02_g.c (q02_) -0.000 0.026 -0.000 1.000 -0.052 0.050
q01_gnd (q01_) 0.290 0.012 24.228 0.000 0.267 0.314
.cncrn_. 0.028 0.037 0.772 0.440 -0.043 0.100
.PHQ8.c 0.020 0.033 0.601 0.548 -0.045 0.083
Std.lv Std.all
-0.000 -0.000
0.290 0.639
0.028 0.028
0.020 0.020
Variances:
Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
q02_g.c (q02_) 0.999 0.021 46.973 0.000 0.956 1.040
q01_gnd (q01_) 0.206 0.005 40.869 0.000 0.195 0.215
.cncrn_. 0.992 0.034 29.181 0.000 0.923 1.057
.PHQ8.c 0.810 0.025 31.751 0.000 0.757 0.855
Std.lv Std.all
0.999 1.000
0.206 1.000
0.992 0.989
0.810 0.836
R-Square:
Estimate
concern_indx.c 0.011
PHQ8.c 0.164
Defined Parameters:
Estimate Std.Err z-value P(>|z|) ci.lower ci.upper
XonM 0.013 0.080 0.168 0.866 -0.143 0.173
MonY 0.265 0.025 10.780 0.000 0.218 0.315
indirect 0.004 0.021 0.167 0.867 -0.038 0.047
direct 0.103 0.062 1.675 0.094 -0.021 0.218
total 0.107 0.065 1.640 0.101 -0.027 0.228
prop.mediated 0.033 8.497 0.004 0.997 -1.117 1.141
XonM.mean.male 0.117 0.119 0.986 0.324 -0.132 0.358
XonM.mean.feml -0.029 0.088 -0.332 0.740 -0.199 0.153
XonM.blw.male 0.228 0.108 2.110 0.035 0.027 0.439
XonM.blw.femal 0.081 0.081 1.009 0.313 -0.075 0.248
XonM.blw.avg 0.124 0.069 1.797 0.072 -0.008 0.264
XonM.abv.male 0.007 0.170 0.041 0.967 -0.337 0.357
XonM.abv.femal -0.139 0.145 -0.960 0.337 -0.428 0.153
XonM.abv.avg -0.097 0.142 -0.682 0.495 -0.372 0.197
MonY.mean.male 0.323 0.048 6.734 0.000 0.227 0.415
MonY.mean.feml 0.242 0.029 8.224 0.000 0.184 0.300
MonY.blw.male 0.317 0.051 6.166 0.000 0.217 0.422
MonY.blw.femal 0.236 0.043 5.512 0.000 0.149 0.323
MonY.blw.avg 0.260 0.037 6.959 0.000 0.186 0.330
MonY.abv.male 0.329 0.058 5.672 0.000 0.208 0.432
MonY.abv.femal 0.247 0.036 6.858 0.000 0.174 0.317
MonY.abv.avg 0.271 0.035 7.741 0.000 0.196 0.339
indirect.mn.ml 0.038 0.040 0.956 0.339 -0.041 0.121
indirct.mn.fml -0.007 0.021 -0.330 0.741 -0.049 0.037
indirct.blw.ml 0.072 0.037 1.973 0.049 0.009 0.148
indrct.blw.fml 0.019 0.020 0.980 0.327 -0.018 0.061
indirct.blw.vg 0.032 0.019 1.727 0.084 -0.002 0.070
indirect.bv.ml 0.002 0.057 0.040 0.968 -0.117 0.124
indirct.bv.fml -0.035 0.037 -0.941 0.346 -0.106 0.034
indirect.bv.vg -0.026 0.039 -0.672 0.501 -0.099 0.056
direct.mean.ml 0.155 0.110 1.402 0.161 -0.059 0.369
direct.men.fml 0.083 0.068 1.220 0.222 -0.053 0.220
direct.blw.mal 0.308 0.108 2.845 0.004 0.100 0.532
direct.blw.fml 0.236 0.074 3.168 0.002 0.094 0.381
direct.blw.avg 0.257 0.066 3.878 0.000 0.127 0.387
direct.abv.mal 0.001 0.143 0.010 0.992 -0.284 0.281
direct.abv.fml -0.071 0.106 -0.666 0.506 -0.286 0.143
direct.abv.avg -0.050 0.105 -0.476 0.634 -0.272 0.160
total.mean.mal 0.193 0.116 1.658 0.097 -0.029 0.416
total.mean.fml 0.075 0.071 1.069 0.285 -0.072 0.207
total.blw.male 0.380 0.110 3.448 0.001 0.172 0.609
total.blw.feml 0.255 0.077 3.315 0.001 0.105 0.405
total.blw.avg 0.289 0.068 4.230 0.000 0.155 0.423
total.abv.male 0.004 0.156 0.024 0.981 -0.293 0.305
total.abv.feml -0.105 0.113 -0.936 0.349 -0.326 0.136
total.abv.avg -0.076 0.113 -0.676 0.499 -0.305 0.158
prop.med.mn.ml 0.197 2.994 0.066 0.948 -1.243 1.394
prop.md.mn.fml -0.093 130.215 -0.001 0.999 -2.985 2.407
prop.md.blw.ml 0.190 0.127 1.501 0.133 0.023 0.490
prp.md.blw.fml 0.075 0.099 0.756 0.450 -0.091 0.275
prop.md.blw.vg 0.111 0.152 0.733 0.463 -0.009 0.285
prop.med.bv.ml 0.612 7.680 0.080 0.937 -5.490 4.844
prop.md.bv.fml -9.224 17.314 -0.533 0.594 -4.765 4.681
prop.med.bv.vg 0.345 1092.623 0.000 1.000 -3.647 3.689
Std.lv Std.all
0.013 0.013
0.265 0.273
0.004 0.003
0.103 0.050
0.107 0.054
0.033 0.064
0.117 0.027
-0.029 -0.013
0.228 0.079
0.081 0.039
0.124 0.064
0.007 -0.025
-0.139 -0.065
-0.097 -0.039
0.323 0.289
0.242 0.246
0.317 0.283
0.236 0.240
0.260 0.267
0.329 0.294
0.247 0.252
0.271 0.279
0.038 0.008
-0.007 -0.003
0.072 0.022
0.019 0.009
0.032 0.017
0.002 -0.007
-0.035 -0.016
-0.026 -0.011
0.155 0.058
0.083 0.038
0.308 0.131
0.236 0.111
0.257 0.124
0.001 -0.016
-0.071 -0.036
-0.050 -0.023
0.193 0.065
0.075 0.034
0.380 0.153
0.255 0.120
0.289 0.141
0.004 -0.023
-0.105 -0.052
-0.076 -0.034
0.197 0.119
-0.093 -0.093
0.190 0.145
0.075 0.078
0.111 0.122
0.612 0.318
-9.224 0.710
0.345 0.323
estimates <- parameterEstimates(fit_mod, standardized = TRUE) %>%
filter(op == "~") %>%
select(-c(std.nox))
p_adj <- p.adjust(estimates$pvalue, method = "holm")
estimates <- cbind(estimates, p_adj)
kableExtra::kbl(estimates) %>%
kableExtra::kable_classic(full_width = FALSE, lightable_options = c("striped")) %>%
kableExtra::row_spec(which(estimates$p_adj < 0.05), bold = TRUE)
| lhs | op | rhs | label | est | se | z | pvalue | ci.lower | ci.upper | std.lv | std.all | p_adj |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| concern_index.c | ~ | q18_02_soc_media | a1 | -0.029 | 0.087 | -0.333 | 0.739 | -0.198 | 0.153 | -0.029 | -0.013 | 1.000 |
| concern_index.c | ~ | q02_age.c | a2 | 0.004 | 0.030 | 0.131 | 0.896 | -0.053 | 0.066 | 0.004 | 0.004 | 1.000 |
| concern_index.c | ~ | q01_gender | a3 | -0.171 | 0.067 | -2.567 | 0.010 | -0.307 | -0.040 | -0.171 | -0.078 | 0.103 |
| concern_index.c | ~ | q18_02_soc_media:q02_age.c | a4 | -0.110 | 0.078 | -1.408 | 0.159 | -0.264 | 0.047 | -0.110 | -0.052 | 1.000 |
| concern_index.c | ~ | q18_02_soc_media:q01_gender | a5 | 0.146 | 0.125 | 1.169 | 0.243 | -0.102 | 0.388 | 0.146 | 0.040 | 1.000 |
| PHQ8.c | ~ | q18_02_soc_media | c1 | 0.083 | 0.068 | 1.218 | 0.223 | -0.050 | 0.218 | 0.083 | 0.038 | 1.000 |
| PHQ8.c | ~ | q02_age.c | c2 | -0.260 | 0.028 | -9.443 | 0.000 | -0.313 | -0.205 | -0.260 | -0.264 | 0.000 |
| PHQ8.c | ~ | q01_gender | c3 | -0.244 | 0.063 | -3.870 | 0.000 | -0.376 | -0.120 | -0.244 | -0.112 | 0.001 |
| PHQ8.c | ~ | q18_02_soc_media:q02_age.c | c4 | -0.153 | 0.062 | -2.473 | 0.013 | -0.272 | -0.029 | -0.153 | -0.073 | 0.120 |
| PHQ8.c | ~ | q18_02_soc_media:q01_gender | c5 | 0.072 | 0.120 | 0.604 | 0.546 | -0.158 | 0.316 | 0.072 | 0.020 | 1.000 |
| PHQ8.c | ~ | concern_index.c | b1 | 0.242 | 0.029 | 8.219 | 0.000 | 0.183 | 0.299 | 0.242 | 0.246 | 0.000 |
| PHQ8.c | ~ | concern_index.c:q02_age.c | b2 | 0.006 | 0.027 | 0.218 | 0.828 | -0.045 | 0.058 | 0.006 | 0.006 | 1.000 |
| PHQ8.c | ~ | concern_index.c:q01_gender | b3 | 0.081 | 0.057 | 1.422 | 0.155 | -0.038 | 0.192 | 0.081 | 0.043 | 1.000 |
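The p_adj column comes from the Holm step-down procedure: the p-values are sorted, the smallest is multiplied by the number of tests m, the next by m - 1, and so on, with a running maximum so the adjusted values never decrease. The sketch below re-implements this by hand on made-up p-values and compares it with `p.adjust()`; the function name `holm_adjust` is illustrative.

```r
# Manual Holm adjustment, for comparison with stats::p.adjust()
holm_adjust <- function(p) {
  m <- length(p)
  ord <- order(p)
  # Multiply the i-th smallest p-value by (m - i + 1), enforce monotonicity, cap at 1
  adj_sorted <- pmin(1, cummax((m - seq_len(m) + 1) * p[ord]))
  adj <- numeric(m)
  adj[ord] <- adj_sorted
  adj
}

p_toy <- c(0.010, 0.040, 0.030)  # illustrative p-values
holm_adjust(p_toy)
p.adjust(p_toy, method = "holm")
```

With 13 regression paths in the table, a path needs a raw p-value well below 0.05 to stay significant after adjustment, which is why only c2, c3, and b1 are below 0.05 in the p_adj column.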
parameters <- parameterEstimates(fit_mod, standardized = TRUE) %>%
filter(op == ":=") %>%
select(-c(op, lhs, rhs, std.nox))
p_adj <- p.adjust(parameters$pvalue, method = "holm")
parameters <- cbind(parameters, p_adj)
kableExtra::kbl(parameters) %>%
kableExtra::kable_classic(full_width = FALSE, lightable_options = c("striped")) %>%
kableExtra::row_spec(which(parameters$p_adj < 0.05), bold = TRUE)
| label | est | se | z | pvalue | ci.lower | ci.upper | std.lv | std.all | p_adj |
|---|---|---|---|---|---|---|---|---|---|
| XonM | 0.013 | 0.080 | 0.168 | 0.866 | -0.143 | 0.173 | 0.013 | 0.013 | 1.000 |
| MonY | 0.265 | 0.025 | 10.780 | 0.000 | 0.218 | 0.315 | 0.265 | 0.273 | 0.000 |
| indirect | 0.004 | 0.021 | 0.167 | 0.867 | -0.038 | 0.047 | 0.004 | 0.003 | 1.000 |
| direct | 0.103 | 0.062 | 1.675 | 0.094 | -0.021 | 0.218 | 0.103 | 0.050 | 1.000 |
| total | 0.107 | 0.065 | 1.640 | 0.101 | -0.027 | 0.228 | 0.107 | 0.054 | 1.000 |
| prop.mediated | 0.033 | 8.497 | 0.004 | 0.997 | -1.117 | 1.141 | 0.033 | 0.064 | 1.000 |
| XonM.mean.male | 0.117 | 0.119 | 0.986 | 0.324 | -0.132 | 0.358 | 0.117 | 0.027 | 1.000 |
| XonM.mean.female | -0.029 | 0.088 | -0.332 | 0.740 | -0.199 | 0.153 | -0.029 | -0.013 | 1.000 |
| XonM.blw.male | 0.228 | 0.108 | 2.110 | 0.035 | 0.027 | 0.439 | 0.228 | 0.079 | 1.000 |
| XonM.blw.female | 0.081 | 0.081 | 1.009 | 0.313 | -0.075 | 0.248 | 0.081 | 0.039 | 1.000 |
| XonM.blw.avg | 0.124 | 0.069 | 1.797 | 0.072 | -0.008 | 0.264 | 0.124 | 0.064 | 1.000 |
| XonM.abv.male | 0.007 | 0.170 | 0.041 | 0.967 | -0.337 | 0.357 | 0.007 | -0.025 | 1.000 |
| XonM.abv.female | -0.139 | 0.145 | -0.960 | 0.337 | -0.428 | 0.153 | -0.139 | -0.065 | 1.000 |
| XonM.abv.avg | -0.097 | 0.142 | -0.682 | 0.495 | -0.372 | 0.197 | -0.097 | -0.039 | 1.000 |
| MonY.mean.male | 0.323 | 0.048 | 6.734 | 0.000 | 0.227 | 0.415 | 0.323 | 0.289 | 0.000 |
| MonY.mean.female | 0.242 | 0.029 | 8.224 | 0.000 | 0.184 | 0.300 | 0.242 | 0.246 | 0.000 |
| MonY.blw.male | 0.317 | 0.051 | 6.166 | 0.000 | 0.217 | 0.422 | 0.317 | 0.283 | 0.000 |
| MonY.blw.female | 0.236 | 0.043 | 5.512 | 0.000 | 0.149 | 0.323 | 0.236 | 0.240 | 0.000 |
| MonY.blw.avg | 0.260 | 0.037 | 6.959 | 0.000 | 0.186 | 0.330 | 0.260 | 0.267 | 0.000 |
| MonY.abv.male | 0.329 | 0.058 | 5.672 | 0.000 | 0.208 | 0.432 | 0.329 | 0.294 | 0.000 |
| MonY.abv.female | 0.247 | 0.036 | 6.858 | 0.000 | 0.174 | 0.317 | 0.247 | 0.252 | 0.000 |
| MonY.abv.avg | 0.271 | 0.035 | 7.741 | 0.000 | 0.196 | 0.339 | 0.271 | 0.279 | 0.000 |
| indirect.mean.male | 0.038 | 0.040 | 0.956 | 0.339 | -0.041 | 0.121 | 0.038 | 0.008 | 1.000 |
| indirect.mean.female | -0.007 | 0.021 | -0.330 | 0.741 | -0.049 | 0.037 | -0.007 | -0.003 | 1.000 |
| indirect.blw.male | 0.072 | 0.037 | 1.973 | 0.049 | 0.009 | 0.148 | 0.072 | 0.022 | 1.000 |
| indirect.blw.female | 0.019 | 0.020 | 0.980 | 0.327 | -0.018 | 0.061 | 0.019 | 0.009 | 1.000 |
| indirect.blw.avg | 0.032 | 0.019 | 1.727 | 0.084 | -0.002 | 0.070 | 0.032 | 0.017 | 1.000 |
| indirect.abv.male | 0.002 | 0.057 | 0.040 | 0.968 | -0.117 | 0.124 | 0.002 | -0.007 | 1.000 |
| indirect.abv.female | -0.035 | 0.037 | -0.941 | 0.346 | -0.106 | 0.034 | -0.035 | -0.016 | 1.000 |
| indirect.abv.avg | -0.026 | 0.039 | -0.672 | 0.501 | -0.099 | 0.056 | -0.026 | -0.011 | 1.000 |
| direct.mean.male | 0.155 | 0.110 | 1.402 | 0.161 | -0.059 | 0.369 | 0.155 | 0.058 | 1.000 |
| direct.mean.female | 0.083 | 0.068 | 1.220 | 0.222 | -0.053 | 0.220 | 0.083 | 0.038 | 1.000 |
| direct.blw.male | 0.308 | 0.108 | 2.845 | 0.004 | 0.100 | 0.532 | 0.308 | 0.131 | 0.177 |
| direct.blw.female | 0.236 | 0.074 | 3.168 | 0.002 | 0.094 | 0.381 | 0.236 | 0.111 | 0.063 |
| direct.blw.avg | 0.257 | 0.066 | 3.878 | 0.000 | 0.127 | 0.387 | 0.257 | 0.124 | 0.005 |
| direct.abv.male | 0.001 | 0.143 | 0.010 | 0.992 | -0.284 | 0.281 | 0.001 | -0.016 | 1.000 |
| direct.abv.female | -0.071 | 0.106 | -0.666 | 0.506 | -0.286 | 0.143 | -0.071 | -0.036 | 1.000 |
| direct.abv.avg | -0.050 | 0.105 | -0.476 | 0.634 | -0.272 | 0.160 | -0.050 | -0.023 | 1.000 |
| total.mean.male | 0.193 | 0.116 | 1.658 | 0.097 | -0.029 | 0.416 | 0.193 | 0.065 | 1.000 |
| total.mean.female | 0.075 | 0.071 | 1.069 | 0.285 | -0.072 | 0.207 | 0.075 | 0.034 | 1.000 |
| total.blw.male | 0.380 | 0.110 | 3.448 | 0.001 | 0.172 | 0.609 | 0.380 | 0.153 | 0.024 |
| total.blw.female | 0.255 | 0.077 | 3.315 | 0.001 | 0.105 | 0.405 | 0.255 | 0.120 | 0.038 |
| total.blw.avg | 0.289 | 0.068 | 4.230 | 0.000 | 0.155 | 0.423 | 0.289 | 0.141 | 0.001 |
| total.abv.male | 0.004 | 0.156 | 0.024 | 0.981 | -0.293 | 0.305 | 0.004 | -0.023 | 1.000 |
| total.abv.female | -0.105 | 0.113 | -0.936 | 0.349 | -0.326 | 0.136 | -0.105 | -0.052 | 1.000 |
| total.abv.avg | -0.076 | 0.113 | -0.676 | 0.499 | -0.305 | 0.158 | -0.076 | -0.034 | 1.000 |
| prop.med.mean.male | 0.197 | 2.994 | 0.066 | 0.948 | -1.243 | 1.394 | 0.197 | 0.119 | 1.000 |
| prop.med.mean.female | -0.093 | 130.215 | -0.001 | 0.999 | -2.985 | 2.407 | -0.093 | -0.093 | 1.000 |
| prop.med.blw.male | 0.190 | 0.127 | 1.501 | 0.133 | 0.023 | 0.490 | 0.190 | 0.145 | 1.000 |
| prop.med.blw.female | 0.075 | 0.099 | 0.756 | 0.450 | -0.091 | 0.275 | 0.075 | 0.078 | 1.000 |
| prop.med.blw.avg | 0.111 | 0.152 | 0.733 | 0.463 | -0.009 | 0.285 | 0.111 | 0.122 | 1.000 |
| prop.med.abv.male | 0.612 | 7.680 | 0.080 | 0.937 | -5.490 | 4.844 | 0.612 | 0.318 | 1.000 |
| prop.med.abv.female | -9.224 | 17.314 | -0.533 | 0.594 | -4.765 | 4.681 | -9.224 | 0.710 | 1.000 |
| prop.med.abv.avg | 0.345 | 1092.623 | 0.000 | 1.000 | -3.647 | 3.689 | 0.345 | 0.323 | 1.000 |
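The conditional effects in the table are plain arithmetic on the path coefficients, evaluated at chosen moderator values. As a sketch, the block below recomputes the effects for males one standard deviation below the mean age from the rounded estimates printed above (a1, a4, a5, b1, b2, b3, and the age mean and variance). Because the inputs are rounded to three decimals, the results agree with the reported values only approximately.

```r
# Rounded path coefficients copied from the regression output
a1 <- -0.029; a4 <- -0.110; a5 <- 0.146
b1 <- 0.242;  b2 <- 0.006;  b3 <- 0.081

# Moderator values: standardized age (mean and variance from the output), male = 1
age_mean <- 0.000
age_sd <- sqrt(0.999)
male <- 1

# X -> M and M -> Y effects one SD below the mean age, for males
xonm_blw_male <- a1 + a4 * (age_mean - age_sd) + a5 * male
mony_blw_male <- b1 + b2 * (age_mean - age_sd) + b3 * male

# Conditional indirect effect is the product of the two conditional paths
indirect_blw_male <- xonm_blw_male * mony_blw_male

round(c(XonM = xonm_blw_male, MonY = mony_blw_male,
        indirect = indirect_blw_male), 3)
```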
One survey item, q50_comment, collected respondents' free-text comments on their situation. To visualize this textual data, we use two pairs of word clouds. Unfortunately, this item was included only in the Czech version of the survey.
The first word cloud pair visualizes the most common tokens and lemmas (the size and color of a word represent its frequency).
# Remove stop words - first, we load the public stop word list
stop_words_cz <- read_csv(
"https://raw.githubusercontent.com/stopwords-iso/stopwords-cs/master/stopwords-cs.txt",
col_names = "word")
# Should the above link become obsolete, alternative source can be reached
# using "stopwords" library:
# stop_words_cz <- as_tibble_col(stopwords::stopwords("cs",
# source = "stopwords-iso"),
# column_name = "word")
# Reshape the data frame into one column called "word"
tidy_dat <- gather(dplyr::as_tibble(data$q50_comment), key, word) %>%
dplyr::select(word)
# STEP 1: Tokenization of the q50 responses
# Tokenize - one word per row of a dataframe/tibble
tokens <- tidy_dat %>%
unnest_tokens(word, word) %>%
dplyr::count(word, sort = TRUE) %>%
ungroup()
# Removing stop words by using anti_join() applied on the stop words list
tokens_clean <- tokens %>%
anti_join(stop_words_cz, by = "word")
# Next, we remove numbers (optional step)
nums <- tokens_clean %>%
dplyr::filter(str_detect(word, "^[0-9]")) %>%
dplyr::select(word) %>%
unique()
tokens_clean <- tokens_clean %>%
anti_join(nums, by = "word")
# We can also remove unique stop words that are still present (optional step)
uni_sw <- data.frame(word = c("např"))
tokens_clean <- tokens_clean %>%
anti_join(uni_sw, by = "word")
# Define a color palette for the Word Cloud
palette <- brewer.pal(8, "Dark2")
# STEP 2: Lemmatization of tokens, using udpipe package
# Creation of uncounted tokens table
tokens_uncounted <- tidy_dat %>%
unnest_tokens(word, word)
# Fitting the udpipe model with downloaded Czech model
udpipe_tokens_lemma <- udpipe(x = tokens_uncounted$word, object = "czech-pdt")
# Extracting resulting lemma column from the model, counting frequency
tidy_dat_lemma <- udpipe_tokens_lemma %>%
select(lemma) %>%
rename(word = lemma) %>%
dplyr::count(word, sort = TRUE)
# Removing stop words by using anti_join() applied on the stop words list
tokens_clean_lemma <- tidy_dat_lemma %>%
anti_join(stop_words_cz, by = "word")
# Next, we remove numbers (optional step)
nums_lemma <- tokens_clean_lemma %>%
dplyr::filter(str_detect(word, "^[0-9]")) %>%
dplyr::select(word) %>%
unique()
tokens_clean_lemma <- tokens_clean_lemma %>%
anti_join(nums_lemma, by = "word")
# We also remove lemmas that are missing (NA) (optional step)
uniq_lemma <- tibble(word = NA_character_)
tokens_clean_lemma <- tokens_clean_lemma %>%
anti_join(uniq_lemma, by = "word")
set.seed(2021)
tokens_clean %>% with(wordcloud(word,
n,
random.order = FALSE,
scale = c(7,.5),
min.freq = 1,
max.words = 100,
colors = palette))
set.seed(2021)
tokens_clean_lemma %>% with(wordcloud(word,
n,
random.order = FALSE,
scale = c(11,.7),
min.freq = 1,
max.words = 100,
colors = palette))
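The tokenize, filter, and count steps that feed the first word cloud pair can be sketched in base R on a toy example. The comments and the stop word list below are made-up English stand-ins; the actual pipeline uses `unnest_tokens()` and a Czech stop word list, which handle Unicode text and are not reproduced here. Lemmatization is also left out of this sketch.

```r
# Toy free-text comments (illustrative English stand-ins for q50_comment)
comments <- c("I am worried about my family.",
              "My family is fine, but I am worried.")

# Hypothetical stop word list
stop_words <- c("i", "am", "about", "my", "is", "but")

# Tokenize: lower-case, then split on anything that is not a letter
tokens <- unlist(strsplit(tolower(comments), "[^a-z]+"))
tokens <- tokens[tokens != ""]

# Remove stop words (the anti_join() step in the analysis)
tokens_clean <- tokens[!tokens %in% stop_words]

# Count token frequencies, most frequent first
freq <- sort(table(tokens_clean), decreasing = TRUE)
freq
```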
The second word cloud pair uses sentiment analysis to create two distinct word clouds (using only lemmas, not tokens): one visualizes only words with positive emotional sentiment, the other only words with negative sentiment.
# First, we load Czech Subjectivity Lexicon from ÚFAL MFF, which assesses
# sentiment for every word as positive or negative
lindat_repository <- "https://lindat.mff.cuni.cz/repository/"
lindat_path <- "xmlui/bitstream/handle/11858/00-097C-0000-0022-FF60-B/"
lindat_file_name <- "sublex_1_0.csv?sequence=1&isAllowed=y"
sentiment_cz <- read_delim(paste0(lindat_repository, lindat_path, lindat_file_name),
"\t",
escape_double = FALSE,
col_names = FALSE,
trim_ws = TRUE) %>%
rename("word" = "X3", "sentiment" = "X4")
# Remove extra symbols
sentiment_cz$word <- str_remove(sentiment_cz$word, pattern = "_.*")
# Next, we create tidy tibble with tokens created in the previous section
# and we use inner_join function to separately save only
# the tokens with positive and negative valency
tokens_sentiment_positive <- tokens_clean_lemma %>%
inner_join(sentiment_cz %>%
filter(sentiment == "POS"), by = "word") %>%
transmute(word, n) %>%
arrange(desc(n))
tokens_sentiment_negative <- tokens_clean_lemma %>%
inner_join(sentiment_cz %>%
filter(sentiment == "NEG"), by = "word") %>%
transmute(word, n) %>%
arrange(desc(n))
set.seed(2021)
tokens_sentiment_positive %>% with(wordcloud(word,
n,
random.order = FALSE,
scale = c(3.5, 2),
max.words = 45,
min.freq = 1,
colors = palette))
set.seed(2021)
tokens_sentiment_negative %>% with(wordcloud(word,
n,
random.order = FALSE,
scale = c(3.5, 2),
max.words = 45,
min.freq = 1,
colors = palette))
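The lexicon lookup behind the sentiment word clouds is an inner join between the lemma counts and a word-level sentiment list. The sketch below shows the same idea with base R `merge()` on made-up data. The words, counts, and the two-column lexicon are illustrative assumptions and do not reflect the contents of the Czech Subjectivity Lexicon.

```r
# Toy lemma frequencies (illustrative)
lemma_counts <- data.frame(word = c("good", "fear", "family", "hope"),
                           n = c(4, 3, 5, 2))

# Toy sentiment lexicon using the same POS/NEG labels as the analysis
lexicon <- data.frame(word = c("good", "hope", "fear"),
                      sentiment = c("POS", "POS", "NEG"))

# Inner join: keep only lemmas that appear in the positive part of the lexicon
positive <- merge(lemma_counts, lexicon[lexicon$sentiment == "POS", ],
                  by = "word")
positive <- positive[order(-positive$n), c("word", "n")]

# Same for the negative part
negative <- merge(lemma_counts, lexicon[lexicon$sentiment == "NEG", ],
                  by = "word")
negative <- negative[order(-negative$n), c("word", "n")]
```

Lemmas without a lexicon entry ("family" here) drop out of both clouds, so the sentiment clouds show only the subset of the vocabulary that the lexicon covers.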